Notice: The reproducibility variables underlying each score are classified by an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Neuroplastic Expansion in Deep Reinforcement Learning

Authors: Jiashun Liu, Johan S. Obando-Ceron, Aaron Courville, Ling Pan

ICLR 2025 | Venue PDF | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental Extensive experiments demonstrate that NE effectively mitigates plasticity loss and outperforms state-of-the-art methods across various tasks in MuJoCo and DeepMind Control Suite environments.
Researcher Affiliation Academia Jiashun Liu, HKUST; Johan Obando-Ceron, Mila – Québec AI Institute, Université de Montréal; Aaron Courville, Mila – Québec AI Institute, Université de Montréal; Ling Pan, HKUST (corresponding author, email: EMAIL)
Pseudocode Yes Appendix D (Pseudocode): D.1 Pseudo-code for NE — Algorithm 1: Neuroplastic Expansion TD3 (...); D.2 Pseudo-code for Truncate Process — Algorithm 2: Truncate Process
Open Source Code Yes We make our code publicly available.
Open Datasets Yes Extensive experiments demonstrate that NE effectively mitigates plasticity loss and outperforms state-of-the-art methods across various tasks in MuJoCo and DeepMind Control Suite environments. (...) We conduct a series of experiments based on the standard continuous control tasks from OpenAI Gym (Brockman et al., 2016) simulated by MuJoCo (Todorov et al., 2012) with a long-term training setting, i.e., 3M steps → 6M.
Dataset Splits No The paper describes training in various environments for a certain number of steps (e.g., "3M steps → 6M") and samples from a replay buffer, but it does not provide explicit training/test/validation dataset splits in the conventional sense for supervised learning.
Hardware Specification Yes Our codes are implemented with Python 3.8 and Torch 1.12.1. All experiments were run on NVIDIA GeForce GTX 3090 GPUs.
Software Dependencies Yes Our codes are implemented with Python 3.8 and Torch 1.12.1.
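The stated stack is Python 3.8 with Torch 1.12.1. A minimal sketch of a version guard matching that stack is shown below; it is illustrative only (the authors' repository may pin versions differently), and the torch check is omitted so the snippet runs without torch installed.

```python
import sys

def check_python(required=(3, 8)) -> bool:
    """Return True if the running interpreter meets the stated Python 3.8 minimum."""
    return sys.version_info[:2] >= required

# Torch 1.12.1 would additionally be verified via `torch.__version__`
# after `pip install torch==1.12.1` (assumption: CPU/CUDA build unspecified).
print(check_python())
```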
Experiment Setup Yes The hyper-parameters for TD3 are presented in Table 2. (...) For Humanoid and Ant tasks, we set grow interval T = 25000, grow number k = 0.01 of rest capacity, prune upper bound ω = 0.4, ending step is the max training step, the threshold of ER is 0.35, and the decay weight α = 0.02 (which is used in all the tasks). For other OpenAI MuJoCo tasks, we set grow interval T = 20000, grow number k = 0.15 of rest capacity, prune upper bound ω = 0.2, ending step is the max training step, and the threshold of ER is 0.25.
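The per-task settings quoted above can be collected into a small config for reference. This is a sketch only: the key names and the `hparams_for` helper are assumptions for illustration, while the numeric values are copied from the quoted setup.

```python
# Hedged sketch: NE hyper-parameters as quoted in the report, grouped per task
# family. Key names are illustrative, not taken from the authors' code.
NE_HPARAMS = {
    "humanoid_ant": {
        "grow_interval_T": 25_000,
        "grow_number_k": 0.01,          # fraction of rest (remaining) capacity
        "prune_upper_bound_omega": 0.4,
        "er_threshold": 0.35,
        "decay_weight_alpha": 0.02,     # stated as shared across all tasks
    },
    "other_mujoco": {
        "grow_interval_T": 20_000,
        "grow_number_k": 0.15,
        "prune_upper_bound_omega": 0.2,
        "er_threshold": 0.25,
        "decay_weight_alpha": 0.02,
    },
}

def hparams_for(task: str) -> dict:
    """Select the hyper-parameter group for a task name (illustrative helper)."""
    group = "humanoid_ant" if task in ("Humanoid", "Ant") else "other_mujoco"
    return NE_HPARAMS[group]
```

In both groups the ending step equals the maximum training step, so the grow/prune schedule runs for the full duration of training.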