Datasets and Benchmarks for Offline Safe Reinforcement Learning
Authors: Zuxin Liu, Zijian Guo, Haohong Lin, Yihang Yao, Jiacheng Zhu, Zhepeng Cen, Hanjiang Hu, Wenhao Yu, Tingnan Zhang, Jie Tan, Ding Zhao
DMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through extensive experiments with over 50000 CPU and 800 GPU hours of computations, we evaluate and compare the performance of these baseline algorithms on the collected datasets, offering insights into their strengths, limitations, and potential areas of improvement. |
| Researcher Affiliation | Collaboration | 1Carnegie Mellon University, 2Google Deepmind |
| Pseudocode | No | The paper describes algorithms and methods in prose and tables but does not include any clearly labeled pseudocode or algorithm blocks with structured steps. |
| Open Source Code | Yes | Our dataset and benchmark are accessible through the following URL: www.offline-saferl.org. We provide three open-sourced packages: FSRL for expert safe RL policies, DSRL for managing datasets and environment wrappers, and OSRL for offline safe learning algorithms. 1. https://github.com/liuzuxin/FSRL 2. https://github.com/liuzuxin/DSRL 3. https://github.com/liuzuxin/OSRL |
| Open Datasets | Yes | Our dataset and benchmark are accessible through the following URL: www.offline-saferl.org. ... The datasets will be hosted on our designated platform accessible via the DSRL package. They are also directly downloadable at http://data.offline-saferl.org/download. All datasets are licensed under the Creative Commons Attribution 4.0 License (CC BY). |
| Dataset Splits | No | The paper describes generating and manipulating datasets, and evaluating algorithms on these datasets using different cost thresholds and random seeds. However, it does not explicitly define or specify training, validation, or test splits for the datasets themselves. |
| Hardware Specification | Yes | Except for the experiments for CDT, which are conducted with NVIDIA A100 GPUs, all other experiments are conducted with AMD EPYC 7542 32-Core CPUs or Intel Xeon CPUs with 4 threads. |
| Software Dependencies | No | The FSRL (Fast Safe Reinforcement Learning) package provides modularized implementations of safe RL algorithms based on PyTorch (Paszke et al., 2019) and the Tianshou framework (Weng et al., 2021). The paper mentions software like PyTorch and Tianshou but does not specify their version numbers. |
| Experiment Setup | Yes | Detailed hyperparameter configurations are provided in Appendix A.4. ... The primary hyperparameters employed in the experiments are summarized in Table 5, and more algorithm-specific parameters can be found in the GitHub repository. |
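The paper evaluates algorithms against per-dataset cost thresholds (see the Dataset Splits row). A minimal sketch of that kind of check is shown below; it assumes a D4RL-style transition dictionary with `costs` and `terminals` keys, which is an assumption about the DSRL data layout rather than its documented API, and the function names and toy data are hypothetical.

```python
# Hypothetical sketch of cost-threshold evaluation on an offline safe RL
# dataset. The keys "costs" and "terminals" are assumed, D4RL-style fields;
# the DSRL package's actual format may differ.

def episode_cost_returns(dataset):
    """Sum per-step costs into per-episode cumulative cost returns."""
    returns, total = [], 0.0
    for cost, done in zip(dataset["costs"], dataset["terminals"]):
        total += cost
        if done:
            returns.append(total)
            total = 0.0
    return returns

def fraction_safe(dataset, cost_threshold):
    """Fraction of episodes whose cumulative cost stays within the threshold."""
    rets = episode_cost_returns(dataset)
    return sum(r <= cost_threshold for r in rets) / len(rets)

# Toy two-episode dataset standing in for a downloaded file.
toy = {
    "costs":     [0.0, 1.0, 1.0, 0.0, 0.0, 0.0],
    "terminals": [False, False, True, False, False, True],
}
print(fraction_safe(toy, cost_threshold=1.0))  # episode costs are 2.0 and 0.0
```

Varying `cost_threshold` over the same dataset reproduces, in miniature, the paper's practice of evaluating each algorithm under several constraint levels.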