Efficient Communication in Multi-Agent Reinforcement Learning with Implicit Consensus Generation
Authors: Dapeng Li, Na Lou, Zhiwei Xu, Bin Zhang, Guoliang Fan
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluated the performance of COCOM in various multi-agent scenarios, including Hallway Task (Wang et al. 2020), StarCraft Multi-Agent Challenge (Samvelyan et al. 2019) and Google Research Football (Kurach et al. 2020). The experimental results demonstrate that COCOM achieves superior performance with lower communication costs. Additionally, the migration of COCOM into different existing frameworks demonstrates its generalizable improvement effects across various original algorithms. |
| Researcher Affiliation | Academia | Dapeng Li (1,2), Na Lou (1,2), Zhiwei Xu (3), Bin Zhang (1,2), Guoliang Fan* (1,2). 1: The Key Laboratory of Cognition and Decision Intelligence for Complex Systems, Institute of Automation, Chinese Academy of Sciences; 2: School of Artificial Intelligence, University of Chinese Academy of Sciences; 3: School of Artificial Intelligence, Shandong University |
| Pseudocode | No | The paper describes its methodology in the 'Method' section using descriptive text and mathematical equations, without including any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain any explicit statements regarding the release of source code or provide a link to a code repository. |
| Open Datasets | Yes | We evaluated the performance of COCOM in various multi-agent scenarios, including Hallway Task (Wang et al. 2020), StarCraft Multi-Agent Challenge (Samvelyan et al. 2019) and Google Research Football (Kurach et al. 2020). |
| Dataset Splits | No | The paper uses multi-agent environments like Hallway, StarCraft Multi-Agent Challenge (SMAC), and Google Research Football (GRF), which are simulation environments rather than fixed datasets with explicit training/test/validation splits. While it mentions evaluating 'Test Win Rate%', it does not specify how the experience generated from these environments is partitioned into training, validation, or test sets in terms of sample counts or percentages. For the Hallway task, it specifies parameters j=2, k=6, and l=10. |
| Hardware Specification | No | The paper does not provide any specific details regarding the hardware (e.g., GPU models, CPU types) used for conducting the experiments. |
| Software Dependencies | No | The paper mentions 'PyMARL2 (Hu et al. 2023)' for fine-tuning baseline parameters but does not provide specific version numbers for any software dependencies (e.g., programming languages, libraries, or frameworks) used for implementing the COCOM algorithm. |
| Experiment Setup | No | The paper states, 'We adopt the fine-tuned parameters in PyMARL2 (Hu et al. 2023) to ensure the performance of baseline algorithms,' but does not provide specific hyperparameter values (such as learning rate, batch size, or number of epochs) or other system-level training settings for its own proposed method or the baselines within the main text. |