Efficient Diversity-based Experience Replay for Deep Reinforcement Learning
Authors: Kaiyan Zhao, Yiming Wang, Yuyang Chen, Yan Li, Leong Hou U, Xiaoguang Niu
IJCAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments are conducted on robotic manipulation tasks in MuJoCo, Atari games, and realistic indoor environments in Habitat. The results demonstrate that our approach not only significantly improves learning efficiency but also achieves superior performance in high-dimensional, realistic environments. |
| Researcher Affiliation | Academia | 1School of Computer Science, Wuhan University, Wuhan, China 2State Key Laboratory of Internet of Things for Smart City, University of Macau, Macao, China 3School of Professional Education, Northwestern University, USA 4School of Artificial Intelligence, Shenzhen Polytechnic University, China |
| Pseudocode | Yes | Algorithm 1 EDER |
| Open Source Code | No | Details are available at https://arxiv.org/abs/2410.20487. (Explanation: This is a link to an arXiv preprint, not a code repository, and there is no explicit statement about code release.) |
| Open Datasets | Yes | Extensive experiments are conducted on robotic manipulation tasks in MuJoCo, Atari games, and realistic indoor environments in Habitat. We evaluate EDER in three environments from the Habitat-Matterport 3D Research Dataset (HM3D) [Ramakrishnan et al., 2021] |
| Dataset Splits | No | The paper uses reinforcement learning environments (Mujoco, Atari, Habitat) where data is generated through agent-environment interaction. It does not provide specific training/test/validation splits for a static dataset. |
| Hardware Specification | No | Work partially performed on the supercomputing system at the Supercomputing Center of Wuhan University and at SICC (supported by SKL-IOTSC, University of Macau). (Explanation: This statement is too general, mentioning "supercomputing system" without specific hardware details like GPU/CPU models.) |
| Software Dependencies | No | The paper mentions several algorithms (DDPG, DQN) and environments (Mujoco, Atari, Habitat) but does not specify software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | In our framework, we partition T into multiple partial trajectories of length b, denoted as τj, each covering the state transitions from t = js to t = js + b − 1, where s represents the sliding step length. The trajectories are obtained by sliding a window with step s = b; this fine-grained segmentation allows us to analyze the behavioral patterns of agents at different stages. The specific formula is as follows: ... For clarity, we set the window length to b = 2 in this part, while other values are explored in the ablation studies. |
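The trajectory partitioning quoted above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name `segment_trajectory` and the list-based trajectory representation are assumptions, and only the indexing rule (segment τj covers transitions t = js through js + b − 1) comes from the paper.

```python
def segment_trajectory(trajectory, b, s):
    """Partition a trajectory into partial trajectories of length b.

    Segment tau_j covers transitions from t = j*s to t = j*s + b - 1,
    where s is the sliding step length (illustrative sketch, not the
    authors' implementation).
    """
    segments = []
    j = 0
    while j * s + b <= len(trajectory):
        segments.append(trajectory[j * s : j * s + b])
        j += 1
    return segments

# With b = 2 and s = b (non-overlapping windows, as in the paper's
# default setting), 6 transitions yield 3 partial trajectories.
transitions = list(range(6))
print(segment_trajectory(transitions, b=2, s=2))  # [[0, 1], [2, 3], [4, 5]]
```

Setting s < b would instead produce overlapping windows, which is presumably what the ablation studies on other values explore.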