Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

BenchMARL: Benchmarking Multi-Agent Reinforcement Learning

Authors: Matteo Bettini, Amanda Prorok, Vincent Moens

JMLR 2024 | Venue PDF | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental In this paper, we introduce BenchMARL, the first MARL training library created to enable standardized benchmarking across different algorithms, models, and environments. ... Appendix A. Experiment results: In this section, we report some experiments to confirm the correctness of the implementations in the library and provide public benchmarking references. Experiment results, aggregated over all tasks, are reported in Fig. 2. Individual task results are reported in Fig. 3.
Researcher Affiliation Collaboration Matteo Bettini¹, Amanda Prorok¹, Vincent Moens² ... ¹ Department of Computer Science and Technology, University of Cambridge, United Kingdom. ² PyTorch Team, Meta.
Pseudocode No The paper does not contain any explicit pseudocode or algorithm blocks. It describes the design and features of the BenchMARL library but does not present any algorithm in a structured, pseudocode format.
Open Source Code Yes BenchMARL is open-sourced on GitHub: https://github.com/facebookresearch/BenchMARL.
Open Datasets Yes Table 2: Environments in BenchMARL. Renderings are shown in Fig. 4.

| Environment | Tasks | Cooperation | Global state | Reward function | Action space | Vectorized |
|---|---|---|---|---|---|---|
| VMAS (Bettini et al., 2022) | 27 | Cooperative + Competitive | No | Shared + Independent + Global | Continuous + Discrete | Yes |
| SMACv2 (Ellis et al., 2022) | 15 | Cooperative | Yes | Global | Discrete | No |
| MPE (Lowe et al., 2017) | 8 | Cooperative + Competitive | Yes | Shared + Independent + Global | Continuous + Discrete | No |
| SISL (Gupta et al., 2017) | 2 | Cooperative | No | Shared | Continuous | No |
| Melting Pot (Leibo et al., 2021) | 49 | Cooperative + Competitive | Yes | Independent | Discrete | No |
Dataset Splits No The paper does not provide specific training/test/validation dataset splits. Reinforcement learning experiments typically involve agents interacting with environments rather than being trained on pre-split static datasets. The paper mentions running experiments with '3 random seeds' for statistical robustness, which is not equivalent to dataset splits for static data.
Hardware Specification No The paper mentions 'batched collection on GPU devices' but does not provide any specific hardware details such as GPU models, CPU types, or memory specifications used for running the experiments.
Software Dependencies No The paper mentions using 'TorchRL' as its backend and 'PyTorch', along with 'Hydra' for configuration. However, it does not specify concrete version numbers for any of these software dependencies or other libraries.
Experiment Setup No The paper states that 'All the algorithms, models, and tasks were run using the default BenchMARL configuration, available in the conf folder. The experiment hyperparameters are available in the fine_tuned/vmas folder.' While it points to where the hyperparameters can be found, it does not explicitly list them or other specific training configurations in the main text, beyond mentioning the use of '3 random seeds' for statistical evaluation.
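For readers unfamiliar with the Hydra-based setup the paper points to, such configurations are typically YAML files whose values can be overridden at the command line rather than hyperparameter lists printed in the text. The sketch below is purely illustrative: the key names are hypothetical stand-ins and do not reflect BenchMARL's actual configuration schema; per the paper, the real defaults live in the library's conf folder and the tuned hyperparameters in the fine_tuned/vmas folder.

```yaml
# Hypothetical sketch of a Hydra-style experiment config (illustrative only;
# key names are assumptions, not BenchMARL's actual schema).
defaults:
  - algorithm: mappo        # which MARL algorithm to run
  - task: vmas/balance      # which environment/task to run it on

experiment:
  seed: 0                   # the paper aggregates results over 3 random seeds
  max_frames: 10_000_000    # total environment frames to collect
  lr: 0.0005                # optimizer learning rate
```

With Hydra, individual values in such a file can be overridden at launch time (e.g. `experiment.seed=1`), which is why libraries structured this way tend to document hyperparameters in config folders rather than in the paper itself.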