reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Learning Progress Driven Multi-Agent Curriculum

Authors: Wenshuai Zhao, Zhiyuan Li, Joni Pajarinen

ICML 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We evaluate our methods on three benchmarks including MPE Simple-Spread (Lowe et al., 2017), the XOR matrix game (Fu et al., 2022), and four SMAC-v2 Protoss tasks (Ellis et al., 2022). The results show that the number of agents can serve as an effective curriculum variable to facilitate exploration. ... Our experiments on three distinct benchmarks demonstrate that SPMARL outperforms baseline methods.
Researcher Affiliation	Academia	¹Department of Electrical Engineering and Automation, Aalto University, Espoo, Finland. Correspondence to: Wenshuai Zhao <EMAIL>.
Pseudocode	Yes	Algorithm 1 Self-Paced Multi-Agent Reinforcement Learning (SPMARL)
Open Source Code	Yes	¹Source Code: https://github.com/wenshuaizhao/spmarl
Open Datasets	Yes	We evaluate our method on three challenging benchmarks with severe sparse rewards, including (1) MPE Simple-Spread task (Lowe et al., 2017), (2) XOR game (Fu et al., 2022), and (3) four SMAC-v2 Protoss tasks (Ellis et al., 2022). ... We further evaluate our method on two additional tasks from the recent BenchMARL benchmark (Bettini et al., 2024).
Dataset Splits	No	The paper does not provide specific training/test/validation dataset splits in terms of explicit percentages, sample counts, or citations to predefined data partitions. For reinforcement learning, data is generated through interaction, and the paper focuses on curriculum generation by varying environment parameters rather than fixed dataset splits.
Hardware Specification	No	We acknowledge CSC IT Center for Science, Finland, for awarding this project access to the LUMI supercomputer, owned by the EuroHPC Joint Undertaking, hosted by CSC (Finland) and the LUMI consortium through CSC. We also acknowledge the computational resources provided by the Aalto Science-IT project and funding by Research Council of Finland (357301). While LUMI is a specific supercomputer, the paper does not provide specific details on GPU/CPU models or other detailed hardware specifications for the experiments.
Software Dependencies	No	The paper mentions algorithmic components like 'Multi-agent PPO (MAPPO)' and 'optimizer Adam', and provides hyperparameters for these. However, it does not list specific software dependencies with version numbers (e.g., 'Python 3.8, PyTorch 1.9').
Experiment Setup	Yes	The hyper-parameters used in our experiments are listed in the appendix. ... Table 3. Common hyper-parameters of MAPPO across all domains ... Table 4. Common hyper-parameters of SPMARL across all domains ... Table 5. Hyper-parameters for MPE ... Table 6. Hyper-parameters for XOR ... Table 7. Hyper-parameters for SMAC v2 tasks ... Table 8. Hyper-parameters for BenchMARL tasks