Multi-Agent Communication with Information Preserving Graph Contrastive Learning

Authors: Wei Du, Shifei Ding, Wei Guo, Yuqing Sun, Guoxian Yu, Lizhen Cui

IJCAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | To verify the effectiveness of MAIL, we perform a range of experiments across 4 benchmarks: Predator-Prey [Sukhbaatar and Fergus, 2016], Traffic Junction [Sukhbaatar and Fergus, 2016], Battle [Zheng et al., 2018], StarCraft Multi-Agent Challenge [Vinyals et al., 2019]. Experiments are conducted with an NVIDIA RTX 4090 GPU. The hyperparameters that we adjust are as follows: (i) k ∈ {3, 5, 10} for k nearest neighbors, (ii) aggregation hops l ∈ {3, 5, 7}, (iii) λ1 = 0.2, λ2 = 0.3, and β = 0.2 depending on the experimental results. For each environment, 4 GNN-based MARL baselines (introduced in Related Work) have been chosen for ease of comparison without losing generality. The detailed hyperparameters and some experiments are given in the Appendix.
Researcher Affiliation | Academia | Wei Du1,2, Shifei Ding3, Wei Guo1,2, Yuqing Sun1, Guoxian Yu1,2, and Lizhen Cui1,2; 1School of Software, Shandong University, China; 2Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, China; 3School of Computer Science and Technology, China University of Mining and Technology, China; EMAIL, EMAIL, EMAIL
Pseudocode | Yes | Algorithm 1 MAIL
1: Initialize: the parameters of networks, the maximum size of the replay buffer, and the frequency of network updating.
2: for each timestep t ≤ T do
3:     for each agent i ∈ N do
4:         // During the decentralized execution period
5:         Generate agent feature x_i by GRU and MLP
6:         Construct graph G = (V, E, X) based on x_i
7:         Receive node representations H_o, H_f, H_r, and H_t
8:         Calculate feature loss L_f, topological loss L_t, and cross-module loss L_c with Eq. 6, Eq. 8, and Eq. 9, respectively
9:         Update parameters according to the overall GCL objective loss L_GCL in Eq. 10
10:        Obtain final message representation h^o_i
11:        Calculate action-value Q_i based on h_i and τ_i
12:        a^t_i ← π(Q_i) (ϵ-greedy)
13:        Store τ_i and a^t_i to replay buffer
14:        // During centralized training period
15:        Feed Q_i to mixing network and obtain Q_tot
16:        Minimize loss function according to Eq. 12
17:        Update weights of all networks
18:    end for
19: end for
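Step 6 of the algorithm constructs a graph over agents from their features. A minimal pure-Python sketch of one plausible construction, directed k-nearest-neighbour edges under Euclidean distance (suggested by the k ∈ {3, 5, 10} sweep; the paper's exact procedure is not reproduced here):

```python
import math

def knn_graph(features, k):
    """Build a directed k-nearest-neighbour edge set over agents.

    features: list of per-agent feature vectors x_i (equal length).
    Returns the set of directed edges (i, j) where j is among agent i's
    k nearest neighbours by Euclidean distance, self-loops excluded.
    """
    n = len(features)
    edges = set()
    for i in range(n):
        # Distance from agent i to every other agent.
        dists = sorted(
            (math.dist(features[i], features[j]), j)
            for j in range(n) if j != i
        )
        for _, j in dists[:k]:
            edges.add((i, j))
    return edges
```

With two well-separated pairs of agents and k = 1, each agent connects only to its partner, e.g. `knn_graph([[0, 0], [0, 1], [5, 5], [5, 6]], 1)` yields the four edges {(0, 1), (1, 0), (2, 3), (3, 2)}.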
Open Source Code | No | The paper does not explicitly state that source code for the described methodology is being released, nor does it provide a link to a code repository.
Open Datasets | Yes | To verify the effectiveness of MAIL, we perform a range of experiments across 4 benchmarks: Predator-Prey [Sukhbaatar and Fergus, 2016], Traffic Junction [Sukhbaatar and Fergus, 2016], Battle [Zheng et al., 2018], StarCraft Multi-Agent Challenge [Vinyals et al., 2019].
Dataset Splits | No | The paper describes the configurations of the multi-agent reinforcement learning environments (e.g., "a 10 × 10 grid with 5 predators", "Nc = 10, p = 0.2"), which define the operational parameters of the simulation. However, it does not provide explicit training/test/validation splits for a fixed dataset, as is common in supervised learning. For RL, data is generated through interaction with the environment.
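The mechanism behind this interaction-generated "dataset" is the replay buffer named in Algorithm 1 (step 13). A minimal FIFO sketch, with capacity and transition format chosen for illustration rather than taken from the paper:

```python
import random
from collections import deque

class ReplayBuffer:
    """Bounded FIFO buffer of transitions collected by interaction.

    In RL there is no fixed train/test split: each episode appends
    fresh transitions, old ones are evicted once capacity is reached,
    and training minibatches are sampled uniformly from what remains.
    """
    def __init__(self, capacity):
        self.buf = deque(maxlen=capacity)

    def add(self, transition):
        self.buf.append(transition)  # evicts oldest when full

    def sample(self, batch_size):
        return random.sample(list(self.buf), batch_size)

    def __len__(self):
        return len(self.buf)
```

Pushing 10 transitions into a buffer of capacity 5 retains only the 5 most recent, mirroring how on-line data collection continually refreshes the training distribution.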
Hardware Specification | Yes | Experiments are conducted with an NVIDIA RTX 4090 GPU.
Software Dependencies | No | The paper does not specify version numbers for any key software components or libraries (e.g., Python, PyTorch, TensorFlow, specific game engines/simulators).
Experiment Setup | Yes | The hyperparameters that we adjust are as follows: (i) k ∈ {3, 5, 10} for k nearest neighbors, (ii) aggregation hops l ∈ {3, 5, 7}, (iii) λ1 = 0.2, λ2 = 0.3, and β = 0.2 depending on the experimental results. For each environment, 4 GNN-based MARL baselines (introduced in Related Work) have been chosen for ease of comparison without losing generality. The detailed hyperparameters and some experiments are given in the Appendix.
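The swept values of k and the aggregation hops l define a 3 × 3 grid of candidate settings, with λ1, λ2, and β held at their reported values. A trivial enumeration of that grid (the dict layout is illustrative, not the paper's configuration format):

```python
from itertools import product

# Sweeps reported in the paper; the loss coefficients are fixed.
ks = [3, 5, 10]   # k for k-nearest-neighbour graph construction
hops = [3, 5, 7]  # aggregation hops l
fixed = {"lambda1": 0.2, "lambda2": 0.3, "beta": 0.2}

# One config dict per (k, l) combination: 3 x 3 = 9 candidates.
configs = [dict(fixed, k=k, l=l) for k, l in product(ks, hops)]
```

Enumerating the grid up front makes the size of the reported search explicit: 9 runs per environment before the fixed coefficients are varied.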