Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
OmniSafe: An Infrastructure for Accelerating Safe Reinforcement Learning Research
Authors: Jiaming Ji, Jiayi Zhou, Borong Zhang, Juntao Dai, Xuehai Pan, Ruiyang Sun, Weidong Huang, Yiran Geng, Mickel Liu, Yaodong Yang
JMLR 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The algorithms implemented in OmniSafe have been rigorously tested in Safety-Gym (Ray et al., 2019) and MuJoCo-Velocity (Zhang et al., 2020) environments to confirm their consistency with the results presented in the original papers. |
| Researcher Affiliation | Academia | Jiaming Ji EMAIL Jiayi Zhou EMAIL Borong Zhang EMAIL Juntao Dai EMAIL Xuehai Pan EMAIL Ruiyang Sun EMAIL Weidong Huang EMAIL Yiran Geng EMAIL Mickel Liu EMAIL Yaodong Yang EMAIL Institute for Artificial Intelligence, Peking University, China |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. Figure 2 depicts a high-level dataflow process, but it is a diagram, not a pseudocode listing. |
| Open Source Code | Yes | Our project is released at: https://github.com/PKU-Alignment/omnisafe. |
| Open Datasets | Yes | The algorithms implemented in OmniSafe have been rigorously tested in Safety-Gym (Ray et al., 2019) and MuJoCo-Velocity (Zhang et al., 2020) environments to confirm their consistency with the results presented in the original papers. |
| Dataset Splits | No | The paper describes experiments conducted in reinforcement learning environments (Safety-Gym, Mujoco-Velocity) where data is generated through agent interaction, rather than using pre-collected datasets with explicit train/test/validation splits. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper mentions 'torch.distributed' for parallel computing, and 'Pylint' and 'MyPy' for code quality, but does not provide specific version numbers for these software dependencies. |
| Experiment Setup | No | The paper is an infrastructure paper describing a framework, OmniSafe. It discusses high-level training commands but does not provide specific experimental setup details such as hyperparameter values (e.g., learning rate, batch size, number of epochs) or detailed training configurations within the main text. |
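The notice above describes an LLM-based pipeline that classifies each reproducibility variable and is validated against a manually labeled dataset. A minimal sketch of that workflow might look like the following; the function names, prompt wording, and response format are illustrative assumptions, not the actual pipeline from [1].

```python
# Hypothetical sketch of an LLM-based reproducibility classifier.
# All names and formats here are assumptions for illustration only.

def build_prompt(variable: str, paper_excerpt: str) -> str:
    """Assemble a classification prompt for one reproducibility variable."""
    return (
        f"Variable: {variable}\n"
        f"Paper excerpt: {paper_excerpt}\n"
        "Answer with a label, then '|', then a supporting quote."
    )

def parse_response(raw: str) -> tuple[str, str]:
    """Split an assumed 'Label | supporting quote' response into parts."""
    label, _, quote = raw.partition("|")
    return label.strip(), quote.strip()

def agreement(predicted: list[str], manual: list[str]) -> float:
    """Validation step: fraction of LLM labels matching manual labels."""
    matches = sum(p == m for p, m in zip(predicted, manual))
    return matches / len(manual)
```

In this sketch, each table row above corresponds to one `parse_response` output (the "Result" and "LLM Response" columns), and the accuracy metrics mentioned in the notice would come from `agreement` over the manually labeled validation set.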