MA$^2$E: Addressing Partial Observability in Multi-Agent Reinforcement Learning with Masked Auto-Encoder
Authors: Sehyeok Kang, Yongsik Lee, Gahee Kim, Song Chong, Se-Young Yun
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We experimentally evaluate our approach on the StarCraft Multi-Agent Challenge (SMAC) (Samvelyan et al., 2019), SMACv2 (Ellis et al., 2023), and Google Research Football (GRF) (Kurach et al., 2020) environments. The experimental results consistently demonstrate that MA2E achieves faster convergence and higher sample efficiency compared to fine-tuned QMIX (Hu et al., 2021), which is the state-of-the-art MARL algorithm. Additionally, MA2E shows comparable or superior performance compared to the cases where full observations are provided or communication is employed, substantiating the ability of MA2E to effectively infer full observations from partial observations. |
| Researcher Affiliation | Academia | Sehyeok Kang1 Yongsik Lee1 Gahee Kim1 Song Chong1 Se-Young Yun1 KAIST AI1 {kangsehyeok0329,dldydtlr93,gaheekim,songchong,yunseyoung} @kaist.ac.kr |
| Pseudocode | Yes | B PSEUDOCODE Algorithm 1 Model with Multi-Agent Masked Auto-Encoder (MA2E) Applied |
| Open Source Code | Yes | The code is available at https://github.com/cheesebro329/MA2E |
| Open Datasets | Yes | We conduct experiments in the following environments: StarCraft Multi-Agent Challenge (SMAC) (Samvelyan et al., 2019) from https://github.com/oxwhirl/smac which is licensed under MIT license. SMACv2 (Ellis et al., 2023) from https://github.com/oxwhirl/smacv2 which is licensed under MIT license. Google Research Football (GRF) (Kurach et al., 2020) from https://github.com/google-research/football which is licensed under Apache License 2.0. |
| Dataset Splits | No | The paper evaluates performance on various scenarios within SMAC, SMACv2, and GRF environments, reporting win rates over 2 million time steps. However, it does not specify traditional train/validation/test dataset splits with percentages, sample counts, or explicit splitting methodologies, as these are reinforcement learning environments where data is collected through interaction rather than pre-split datasets. |
| Hardware Specification | Yes | Experiments are carried out on NVIDIA A6000 and RTX 3090 GPUs and an AMD EPYC 7313 CPU. |
| Software Dependencies | No | All algorithms are implemented based on the open-source framework pymarl2 (Hu et al., 2021) from https://github.com/hijkzzz/pymarl2 which is an augmented version of pymarl from https://github.com/oxwhirl/pymarl. Both are licensed under Apache License 2.0. The paper mentions software tools used (pymarl2, pymarl) but does not provide specific version numbers for these or other relevant software dependencies like Python or deep learning frameworks. |
| Experiment Setup | Yes | Table 4: The hyperparameter settings for the baseline algorithms; Table 5: The hyperparameter settings for MA2E |