reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Discrete GCBF Proximal Policy Optimization for Multi-agent Safe Optimal Control

Authors: Songyuan Zhang, Oswin So, Mitchell Black, Chuchu Fan

ICLR 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We empirically validate our claims on a suite of multi-agent tasks spanning three different simulation engines. The results suggest that, compared with existing methods, our DGPPO framework obtains policies that achieve high task performance (matching baselines that ignore the safety constraints), and high safety rates (matching the most conservative baselines), with a constant set of hyperparameters across all environments.
Researcher Affiliation	Academia	Department of Aeronautics and Astronautics, MIT; MIT Lincoln Laboratory
Pseudocode	No	The paper presents a diagram in Figure 1 labeled "DGPPO algorithm" but does not include structured pseudocode or an algorithm block.
Open Source Code	Yes	The code of our algorithm and the baselines are provided in the dgppo.zip file in the supplementary materials and online at https://github.com/MIT-REALM/dgppo.
Open Datasets	Yes	Environments. We evaluate DGPPO in a wide range of environments including four LiDAR environments (TARGET, SPREAD, LINE, BICYCLE) where the agents use LiDAR to detect obstacles (Keyumarsi et al., 2023), one MuJoCo environment TRANSPORT (Todorov et al., 2012), and two VMAS environments (TRANSPORT2, WHEEL) (Bettini et al., 2022; 2024).
Dataset Splits	No	The paper mentions evaluating each run on "32 different initial conditions" but does not specify traditional training, validation, or test dataset splits, as the data is generated within simulation environments rather than being a static dataset.
Hardware Specification	Yes	The experiments are run on a 13th Gen Intel(R) Core(TM) i7-13700KF CPU with 64GB RAM and an NVIDIA GeForce RTX 4090 GPU.
Software Dependencies	No	The paper mentions using JAX (Bradbury et al., 2018) for implementing baselines but does not provide specific version numbers for JAX or other key software dependencies.
Experiment Setup	Yes	In Table 1, we provide the value of the common hyperparameters for DGPPO and the baselines. Besides these common hyperparameters, the value of the unique hyperparameters of DGPPO are provided in Table 2.