Learning Progress Driven Multi-Agent Curriculum
Authors: Wenshuai Zhao, Zhiyuan Li, Joni Pajarinen
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our methods on three benchmarks including MPE Simple-Spread (Lowe et al., 2017), the XOR matrix game (Fu et al., 2022), and four SMAC-v2 Protoss tasks (Ellis et al., 2022). The results show that the number of agents can serve as an effective curriculum variable to facilitate exploration. ... Our experiments on three distinct benchmarks demonstrate that SPMARL outperforms baseline methods. |
| Researcher Affiliation | Academia | Department of Electrical Engineering and Automation, Aalto University, Espoo, Finland. Correspondence to: Wenshuai Zhao <EMAIL>. |
| Pseudocode | Yes | Algorithm 1 Self-Paced Multi-Agent Reinforcement Learning (SPMARL) |
| Open Source Code | Yes | Source Code: https://github.com/wenshuaizhao/spmarl |
| Open Datasets | Yes | We evaluate our method on three challenging benchmarks with severe sparse rewards, including (1) MPE Simple-Spread task (Lowe et al., 2017), (2) XOR game (Fu et al., 2022), and (3) four SMAC-v2 Protoss tasks (Ellis et al., 2022). ... We further evaluate our method on two additional tasks from the recent BenchMARL benchmark (Bettini et al., 2024). |
| Dataset Splits | No | The paper does not provide specific training/test/validation dataset splits in terms of explicit percentages, sample counts, or citations to predefined data partitions. For reinforcement learning, data is generated through interaction, and the paper focuses on curriculum generation by varying environment parameters rather than fixed dataset splits. |
| Hardware Specification | No | We acknowledge CSC IT Center for Science, Finland, for awarding this project access to the LUMI supercomputer, owned by the EuroHPC Joint Undertaking, hosted by CSC (Finland) and the LUMI consortium through CSC. We also acknowledge the computational resources provided by the Aalto Science-IT project and funding by Research Council of Finland (357301). While LUMI is a specific supercomputer, the paper does not provide specific details on GPU/CPU models or other detailed hardware specifications for the experiments. |
| Software Dependencies | No | The paper mentions algorithmic components like 'Multi-agent PPO (MAPPO)' and 'optimizer Adam', and provides hyperparameters for these. However, it does not list specific software dependencies with version numbers (e.g., 'Python 3.8, PyTorch 1.9'). |
| Experiment Setup | Yes | The hyper-parameters used in our experiments are listed in the appendix. ... Table 3. Common hyper-parameters of MAPPO across all domains ... Table 4. Common hyper-parameters of SPMARL across all domains ... Table 5. Hyper-parameters for MPE ... Table 6. Hyper-parameters for XOR ... Table 7. Hyper-parameters for SMAC v2 tasks ... Table 8. Hyper-parameters for BenchMARL tasks |