Novelty-Guided Data Reuse for Efficient and Diversified Multi-Agent Reinforcement Learning

Authors: Yangkun Chen, Kai Yang, Jian Tao, Jiafei Lyu

AAAI 2025

Reproducibility Variable Result LLM Response
Research Type Experimental Our evaluations confirm substantial improvements in MARL effectiveness in complex cooperative scenarios such as Google Research Football and super-hard StarCraft II micromanagement tasks. ... We recorded the test win rates of each method on various tasks and compared the final performance and convergence rates of different methods. We plotted win rate curves of different methods under various task environments for comparison, as shown in Figure 3. ... Ablation Study In this section, we will verify questions (3) and (4).
Researcher Affiliation Academia Yangkun Chen*, Kai Yang*, Jian Tao, Jiafei Lyu, Shenzhen International Graduate School, Tsinghua University
Pseudocode No The paper describes the MANGER framework and its update formulas (equations 1-10) in detail, but it does not present any explicitly labeled pseudocode or algorithm blocks with structured, numbered steps.
Open Source Code Yes Code: https://github.com/kkane99/MANGER
Open Datasets Yes We employed the widely used StarCraft Multi-Agent Challenge (SMAC, (Samvelyan et al. 2019)) in multi-agent reinforcement learning. ... We also used the Google Research Football (GRF) (Kurach et al. 2020) environment, which contains numerous multi-agent tasks
Dataset Splits No The paper mentions evaluating on 'a variety of tasks' for SMAC and 'three more challenging settings' for GRF, and discusses 'test win rates' and 'comparison of training time and the final results'. However, it does not provide specific details on how the data within these environments is split into training, validation, or test sets (e.g., percentages, sample counts, or references to predefined splits).
Hardware Specification No The paper does not provide any specific hardware details such as GPU models, CPU models, or memory specifications used for running the experiments.
Software Dependencies No We utilized PyMARL2 (Hu et al. 2021) as our codebase and employed QMIX as our baseline algorithm. While PyMARL2 is mentioned as a codebase, no specific version numbers are provided for PyMARL2 or any other software libraries or frameworks.
Experiment Setup Yes In this study, we set α = 2 and observe that the mean number of extra updates is less than 0.5, which does not significantly increase the training time. ... We compared our approach against several popular methods, including QMIX, QPLEX, and Qatten, using the parameters recommended in the respective papers.
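The quoted setup says that with α = 2 the mean number of extra (novelty-guided) updates per sample stays below 0.5, so the reuse mechanism adds little training time. The paper does not give the exact mapping from novelty to update count here, so the sketch below is a hypothetical illustration only: it assumes per-sample novelty scores in [0, 1] and a simple `floor(α · novelty)` rule, which keeps the mean extra-update count small whenever most samples have low novelty. The function name and the novelty distribution are our own assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def extra_update_counts(novelty, alpha=2.0):
    # Hypothetical rule (not the paper's): each sample earns
    # floor(alpha * novelty) additional gradient updates, so only
    # sufficiently novel samples trigger any reuse at all.
    return np.floor(alpha * np.clip(novelty, 0.0, 1.0)).astype(int)

# Assumed novelty distribution, skewed toward low novelty as is typical
# once training has progressed; under it the mean stays well below 0.5.
novelty = rng.beta(1, 4, size=10_000)
extras = extra_update_counts(novelty)
print(round(extras.mean(), 3))
```

Under this assumed rule with α = 2, a sample needs novelty ≥ 0.5 before it receives even one extra update, which is consistent with the reported low average overhead.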