Novelty-Guided Data Reuse for Efficient and Diversified Multi-Agent Reinforcement Learning
Authors: Yangkun Chen, Kai Yang, Jian Tao, Jiafei Lyu
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our evaluations confirm substantial improvements in MARL effectiveness in complex cooperative scenarios such as Google Research Football and super-hard StarCraft II micromanagement tasks. ... We recorded the test win rates of each method on various tasks and compared the final performance and convergence rates of different methods. We plotted win rate curves of different methods under various task environments for comparison, as shown in Figure 3. ... Ablation Study In this section, we will verify questions (3) and (4). |
| Researcher Affiliation | Academia | Yangkun Chen*, Kai Yang*, Jian Tao, Jiafei Lyu Shenzhen International Graduate School, Tsinghua University EMAIL |
| Pseudocode | No | The paper describes the MANGER framework and its update formulas (equations 1-10) in detail, but it does not present any explicitly labeled pseudocode or algorithm blocks with structured, numbered steps. |
| Open Source Code | Yes | Code: https://github.com/kkane99/MANGER |
| Open Datasets | Yes | We employed the widely used StarCraft Multi-Agent Challenge (SMAC, (Samvelyan et al. 2019)) in multi-agent reinforcement learning. ... We also used the Google Research Football (GRF) (Kurach et al. 2020) environment, which contains numerous multi-agent tasks |
| Dataset Splits | No | The paper mentions evaluating on 'a variety of tasks' for SMAC and 'three more challenging settings' for GRF, and discusses 'test win rates' and 'comparison of training time and the final results'. However, it does not provide specific details on how the data within these environments is split into training, validation, or test sets (e.g., percentages, sample counts, or references to predefined splits). |
| Hardware Specification | No | The paper does not provide any specific hardware details such as GPU models, CPU models, or memory specifications used for running the experiments. |
| Software Dependencies | No | We utilized PyMARL2 (Hu et al. 2021) as our codebase and employed QMIX as our baseline algorithm. While PyMARL2 is mentioned as a codebase, no specific version numbers are provided for PyMARL2 or any other software libraries or frameworks. |
| Experiment Setup | Yes | In this study, we set α = 2 and observe that the mean number of extra updates is less than 0.5, which does not significantly increase the training time. ... We compared our approach against several popular methods, including QMIX, QPLEX, and Qatten, using the parameters recommended in the respective papers. |
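The α = 2 setting above controls how many extra updates novelty triggers per batch. As a hedged sketch (not the authors' code), one plausible reading is that a normalized per-batch novelty score is scaled by α and rounded to a count of additional gradient updates; the function name and the [0, 1] normalization are assumptions for illustration.

```python
def extra_update_count(novelty: float, alpha: float = 2.0) -> int:
    """Map a per-batch novelty score in [0, 1] to a number of extra updates.

    Hypothetical helper: with alpha = 2, low-novelty batches (score < 0.25)
    get zero extra updates, which would keep the mean extra-update count
    small, consistent with the reported mean of fewer than 0.5.
    """
    if not 0.0 <= novelty <= 1.0:
        raise ValueError("novelty is assumed to be normalized to [0, 1]")
    return round(alpha * novelty)


# Example: a mostly-familiar batch triggers no extra updates,
# while a highly novel one triggers up to alpha extra updates.
print(extra_update_count(0.1))  # low novelty
print(extra_update_count(0.9))  # high novelty
```

Under this reading, training time grows only mildly because extra updates concentrate on the rare highly novel batches rather than applying uniformly.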