Efficient Communication in Multi-Agent Reinforcement Learning with Implicit Consensus Generation

Authors: Dapeng Li, Na Lou, Zhiwei Xu, Bin Zhang, Guoliang Fan

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluated the performance of COCOM in various multi-agent scenarios, including the Hallway task (Wang et al. 2020), the StarCraft Multi-Agent Challenge (Samvelyan et al. 2019), and Google Research Football (Kurach et al. 2020). The experimental results demonstrate that COCOM achieves superior performance with lower communication costs. Additionally, integrating COCOM into different existing frameworks demonstrates that it yields consistent improvements across the original algorithms.
Researcher Affiliation | Academia | Dapeng Li (1,2), Na Lou (1,2), Zhiwei Xu (3), Bin Zhang (1,2), Guoliang Fan* (1,2). (1) The Key Laboratory of Cognition and Decision Intelligence for Complex Systems, Institute of Automation, Chinese Academy of Sciences; (2) School of Artificial Intelligence, University of Chinese Academy of Sciences; (3) School of Artificial Intelligence, Shandong University. EMAIL, EMAIL
Pseudocode | No | The paper describes its methodology in the 'Method' section using descriptive text and mathematical equations, without any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | No | The paper contains no explicit statement about releasing source code and provides no link to a code repository.
Open Datasets | Yes | We evaluated the performance of COCOM in various multi-agent scenarios, including the Hallway task (Wang et al. 2020), the StarCraft Multi-Agent Challenge (Samvelyan et al. 2019), and Google Research Football (Kurach et al. 2020).
Dataset Splits | No | The paper uses multi-agent environments (Hallway, the StarCraft Multi-Agent Challenge (SMAC), and Google Research Football (GRF)) that are simulations rather than fixed datasets with explicit training/validation/test splits. Although it reports a 'Test Win Rate %', it does not specify how the experience generated in these environments is partitioned into training, validation, or test sets in terms of sample counts or percentages. For the Hallway task, it specifies the parameters j=2, k=6, and l=10.
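The distinction above (simulation environments versus fixed datasets) can be illustrated with a minimal sketch. This is not the paper's code: the environment, policy, and episode counts below are hypothetical stand-ins. It shows why "dataset splits" do not apply in MARL benchmarks: training data is generated online by rolling out the current policy, and a reported test win rate comes from separate evaluation rollouts rather than a held-out data partition.

```python
# Illustrative sketch only: in MARL benchmarks such as SMAC or Hallway,
# experience is generated online, so there is no fixed train/val/test split.
# Evaluation is a separate batch of rollouts (e.g. the "Test Win Rate %").
import random

class DummyHallwayEnv:
    """Toy stand-in for an episodic multi-agent environment (hypothetical)."""
    def __init__(self, n_agents=2, horizon=10):
        self.n_agents, self.horizon = n_agents, horizon

    def rollout(self, policy):
        """Run one episode; return its transitions and a win flag."""
        episode = []
        for _ in range(self.horizon):
            obs = [random.random() for _ in range(self.n_agents)]
            actions = [policy(o) for o in obs]
            episode.append((obs, actions))
        return episode, random.random() < 0.5  # win flag is random here

def train_and_evaluate(env, policy, train_episodes=8, test_episodes=4):
    replay = []
    for _ in range(train_episodes):  # "training data" = fresh rollouts
        episode, _ = env.rollout(policy)
        replay.extend(episode)
    # "test set" = separate evaluation rollouts, not a data partition
    wins = sum(env.rollout(policy)[1] for _ in range(test_episodes))
    return len(replay), wins / test_episodes

env = DummyHallwayEnv()
buffer_size, win_rate = train_and_evaluate(env, policy=lambda o: int(o > 0.5))
```

With these toy settings the replay buffer holds 8 episodes of 10 steps each, and the win rate is an empirical average over 4 evaluation episodes.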
Hardware Specification | No | The paper provides no specific details about the hardware (e.g., GPU models or CPU types) used to conduct the experiments.
Software Dependencies | No | The paper mentions PyMARL2 (Hu et al. 2023) for fine-tuned baseline parameters but gives no version numbers for the software dependencies (e.g., programming languages, libraries, or frameworks) used to implement the COCOM algorithm.
Experiment Setup | No | The paper states, 'We adopt the fine-tuned parameters in PyMARL2 (Hu et al. 2023) to ensure the performance of baseline algorithms,' but the main text provides no specific hyperparameter values (such as learning rate, batch size, or number of epochs) or other system-level training settings for its own proposed method or the baselines.