Iteratively Refined Behavior Regularization for Offline Reinforcement Learning

Authors: Yi Ma, Jianye Hao, Xiaohan Hu, Yan Zheng, Chenjun Xiao

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experimental results on the D4RL benchmark indicate that our method outperforms previous state-of-the-art baselines in most tasks, clearly demonstrating its superiority over behavior regularization."
Researcher Affiliation | Collaboration | Yi Ma (1,2), Jianye Hao (3,4), Xiaohan Hu (3), Yan Zheng (3), Chenjun Xiao (5). (1) School of Computer and Information Technology, Shanxi University, EMAIL; (2) Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education; (3) College of Intelligence and Computing, Tianjin University, EMAIL; (4) Noah's Ark Lab, Huawei; (5) The Chinese University of Hong Kong, Shenzhen, EMAIL
Pseudocode | Yes | "We give the pseudocode of both CPI and CPI-RE in Algorithm 1." (An illustrative, hedged sketch of an iteratively refined regularization update follows the table.)
Open Source Code | Yes | Code is provided at https://github.com/mamengyiyi/CPI.
Open Datasets | Yes | "Experimental results on the D4RL benchmark indicate that our method outperforms previous state-of-the-art baselines in most tasks..." (A D4RL loading example follows the table.)
Dataset Splits | No | The paper uses the D4RL benchmark but does not explicitly describe training/validation/test splits, such as split percentages or per-split sample counts.
Hardware Specification | Yes | "All experiments are run on a GeForce GTX 2080TI GPU."
Software Dependencies | No | The paper builds on TD3+BC and links a GitHub repository, but it does not specify version numbers for the programming language (e.g., Python), the deep learning framework (e.g., PyTorch), or other ancillary libraries. (A hedged sketch of the TD3+BC objective it builds on follows the table.)
Experiment Setup | Yes | "Table 3: CPI Hyperparameters" and "Table 4: Regularization parameter τ and weighting factor λ of CPI for all datasets" detail the experimental setup, including specific hyperparameter values.
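
Note on the Pseudocode row: the paper's Algorithm 1 is not reproduced in this report. The Python sketch below illustrates only one possible reading of the title's "iteratively refined behavior regularization", in which the actor is regularized toward a reference policy that is periodically refreshed rather than toward a fixed behavior target. The networks, batch data, and refresh interval are hypothetical stand-ins, not the paper's CPI or CPI-RE.

# Hypothetical sketch of an iteratively refined regularization target.
# Illustrative only; this is NOT the paper's Algorithm 1 (CPI / CPI-RE).
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

S, A = 17, 6                                   # toy state/action dims
actor = nn.Linear(S, A)                        # stand-in policy network
critic = nn.Linear(S + A, 1)                   # stand-in Q network
ref_actor = copy.deepcopy(actor)               # reference policy to refine
opt = torch.optim.Adam(actor.parameters(), lr=3e-4)
REFRESH_EVERY = 1000                           # placeholder refresh interval

for step in range(5000):
    states = torch.randn(256, S)               # stand-in data batch
    pi = actor(states)
    with torch.no_grad():
        ref_a = ref_actor(states)              # refined regularization target
    q = critic(torch.cat([states, pi], dim=-1))
    loss = -q.mean() + F.mse_loss(pi, ref_a)   # value term plus regularizer
    opt.zero_grad(); loss.backward(); opt.step()
    if (step + 1) % REFRESH_EVERY == 0:        # iteratively refine the target
        ref_actor.load_state_dict(actor.state_dict())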
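
Note on the Open Datasets row: the D4RL benchmark is publicly available. As a hedged illustration, a dataset can be loaded as follows; the task name is an example, not a claim about which datasets the paper uses.

# Loading a D4RL benchmark dataset; the task name is illustrative.
import gym
import d4rl  # importing registers the D4RL environments with gym

env = gym.make("halfcheetah-medium-v2")
data = d4rl.qlearning_dataset(env)  # observations, actions, rewards,
                                    # next_observations, terminals
print(data["observations"].shape, data["actions"].shape)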
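
Note on the Software Dependencies row: the paper states TD3+BC as its base. For context, here is a minimal sketch of the published TD3+BC actor objective (maximize Q while penalizing deviation from dataset actions, with the Q term rescaled as in the TD3+BC paper). The networks and the `lam` value are illustrative stand-ins and are not claimed to match the authors' code.

# Minimal sketch of the TD3+BC actor objective the paper builds on.
import torch
import torch.nn as nn
import torch.nn.functional as F

def td3bc_actor_loss(actor, critic, states, actions, lam=2.5):
    pi = actor(states)
    q = critic(torch.cat([states, pi], dim=-1))
    coef = lam / q.abs().mean().detach()       # dataset-independent Q scale
    return -coef * q.mean() + F.mse_loss(pi, actions)

# Toy usage with stand-in networks and a random batch.
actor, critic = nn.Linear(17, 6), nn.Linear(23, 1)
loss = td3bc_actor_loss(actor, critic, torch.randn(8, 17), torch.randn(8, 6))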