Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty, so scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

OmniSafe: An Infrastructure for Accelerating Safe Reinforcement Learning Research

Authors: Jiaming Ji, Jiayi Zhou, Borong Zhang, Juntao Dai, Xuehai Pan, Ruiyang Sun, Weidong Huang, Yiran Geng, Mickel Liu, Yaodong Yang

JMLR 2024 | Venue PDF | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The algorithms implemented in OmniSafe have been rigorously tested in Safety-Gym (Ray et al., 2019) and MuJoCo-Velocity (Zhang et al., 2020) environments to confirm their consistency with the results presented in the original papers. |
| Researcher Affiliation | Academia | Jiaming Ji, Jiayi Zhou, Borong Zhang, Juntao Dai, Xuehai Pan, Ruiyang Sun, Weidong Huang, Yiran Geng, Mickel Liu, Yaodong Yang — Institute for Artificial Intelligence, Peking University, China |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. Figure 2 depicts a high-level dataflow process, but it is a diagram, not a pseudocode listing. |
| Open Source Code | Yes | Our project is released at: https://github.com/PKU-Alignment/omnisafe. |
| Open Datasets | Yes | The algorithms implemented in OmniSafe have been rigorously tested in Safety-Gym (Ray et al., 2019) and MuJoCo-Velocity (Zhang et al., 2020) environments to confirm their consistency with the results presented in the original papers. |
| Dataset Splits | No | The paper describes experiments conducted in reinforcement learning environments (Safety-Gym, MuJoCo-Velocity) where data is generated through agent interaction, rather than using pre-collected datasets with explicit train/test/validation splits. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types, or detailed machine specifications) used for running its experiments. |
| Software Dependencies | No | The paper mentions 'torch.distributed' for parallel computing, and 'Pylint' and 'MyPy' for code quality, but does not provide specific version numbers for these software dependencies. |
| Experiment Setup | No | The paper is an infrastructure paper describing a framework, OmniSafe. It discusses high-level training commands but does not provide specific experimental setup details such as hyperparameter values (e.g., learning rate, batch size, number of epochs) or detailed training configurations within the main text. |
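The Dataset Splits row hinges on a distinction worth making concrete: in safe reinforcement learning, training data is generated on the fly through agent-environment interaction (with a per-step safety cost alongside the reward), so there is no pre-collected dataset to split. Below is a minimal illustrative sketch of such an interaction loop; the toy environment and `collect_rollout` helper are hypothetical stand-ins, not OmniSafe's actual API.

```python
import random

class ToyEnv:
    """Hypothetical stand-in for a Safety-Gym-style environment (not OmniSafe's API)."""

    def reset(self):
        self.t = 0
        return 0.0  # initial observation

    def step(self, action):
        self.t += 1
        obs = random.random()                 # next observation
        reward = 1.0 if action > 0 else 0.0   # task reward
        cost = random.random()                # safety cost signal, as in safe RL
        done = self.t >= 5                    # short fixed-horizon episode
        return obs, reward, cost, done

def collect_rollout(env, policy):
    """Training data is produced by interaction, not read from a train/test split."""
    trajectory = []
    obs, done = env.reset(), False
    while not done:
        action = policy(obs)
        next_obs, reward, cost, done = env.step(action)
        trajectory.append((obs, action, reward, cost))
        obs = next_obs
    return trajectory

# One episode yields one fresh batch of (obs, action, reward, cost) transitions.
rollout = collect_rollout(ToyEnv(), policy=lambda obs: 1)
```

Each call to `collect_rollout` produces a new trajectory, which is why the report treats conventional dataset splits as inapplicable here.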