Learning Progress Driven Multi-Agent Curriculum

Authors: Wenshuai Zhao, Zhiyuan Li, Joni Pajarinen

ICML 2025

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | We evaluate our methods on three benchmarks including MPE Simple-Spread (Lowe et al., 2017), the XOR matrix game (Fu et al., 2022), and four SMAC-v2 Protoss tasks (Ellis et al., 2022). The results show that the number of agents can serve as an effective curriculum variable to facilitate exploration. ... Our experiments on three distinct benchmarks demonstrate that SPMARL outperforms baseline methods. |
| Researcher Affiliation | Academia | Department of Electrical Engineering and Automation, Aalto University, Espoo, Finland. Correspondence to: Wenshuai Zhao <EMAIL>. |
| Pseudocode | Yes | Algorithm 1 Self-Paced Multi-Agent Reinforcement Learning (SPMARL) |
| Open Source Code | Yes | Source Code: https://github.com/wenshuaizhao/spmarl |
| Open Datasets | Yes | We evaluate our method on three challenging benchmarks with severe sparse rewards, including (1) MPE Simple-Spread task (Lowe et al., 2017), (2) XOR game (Fu et al., 2022), and (3) four SMAC-v2 Protoss tasks (Ellis et al., 2022). ... We further evaluate our method on two additional tasks from the recent BenchMARL benchmark (Bettini et al., 2024). |
| Dataset Splits | No | The paper does not provide specific training/test/validation dataset splits in terms of explicit percentages, sample counts, or citations to predefined data partitions. For reinforcement learning, data is generated through interaction, and the paper focuses on curriculum generation by varying environment parameters rather than fixed dataset splits. |
| Hardware Specification | No | We acknowledge CSC IT Center for Science, Finland, for awarding this project access to the LUMI supercomputer, owned by the EuroHPC Joint Undertaking, hosted by CSC (Finland) and the LUMI consortium through CSC. We also acknowledge the computational resources provided by the Aalto Science-IT project and funding by Research Council of Finland (357301). While LUMI is a specific supercomputer, the paper does not provide specific details on GPU/CPU models or other detailed hardware specifications for the experiments. |
| Software Dependencies | No | The paper mentions algorithmic components like 'Multi-agent PPO (MAPPO)' and 'optimizer Adam', and provides hyperparameters for these. However, it does not list specific software dependencies with version numbers (e.g., 'Python 3.8, PyTorch 1.9'). |
| Experiment Setup | Yes | The hyper-parameters used in our experiments are listed in the appendix. ... Table 3. Common hyper-parameters of MAPPO across all domains ... Table 4. Common hyper-parameters of SPMARL across all domains ... Table 5. Hyper-parameters for MPE ... Table 6. Hyper-parameters for XOR ... Table 7. Hyper-parameters for SMAC v2 tasks ... Table 8. Hyper-parameters for BenchMARL tasks |