reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Towards Efficient Collaboration via Graph Modeling in Reinforcement Learning

Authors: Wenzhe Fan, Zishun Yu, Chengdong Ma, Changye Li, Yaodong Yang, Xinhua Zhang

AAAI 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Empirical results in networked systems such as traffic scheduling and power control demonstrate that f-MAT achieves superior performance compared to strong baselines, thereby paving the way for handling complex collaborative problems. We evaluate the performance and efficiency of f-MAT in grid alignment, traffic scheduling, and power control. Empirical results demonstrate that f-MAT fulfills the efficient collaboration compared to other baselines, paving the way for efficient collaboration in multi-agent systems.
Researcher Affiliation	Academia	Wenzhe Fan1, Zishun Yu1, Chengdong Ma2, Changye Li3, Yaodong Yang2, Xinhua Zhang1 1 University of Illinois Chicago 2 Institute for Artificial Intelligence, Peking University 3 Yuanpei College, Peking University EMAIL, EMAIL, EMAIL, EMAIL, EMAIL, EMAIL
Pseudocode	Yes	The pseudo code of f-MAT can be found in Appendix A. The complete pseudocode for f-MAT’s encoder and decoder can be found in Algorithm 1 in Appendix A. The method is detailed in Algorithm 3 in Appendix A.
Open Source Code	No	No explicit statement about open-source code or repository links was found.
Open Datasets	Yes	Our first experiment is on a simplified domain of traffic flow (Zhang, Aberdeen, and Vishwanathan 2007), called Grid Sim... The second environment adapted the Simulation of Urban Mobility (SUMO, Chen et al. 2020; Ault and Sharon 2021)... We have two microgrid systems (Chen et al. 2021)...
Dataset Splits	No	The paper describes the experimental environments and their characteristics, but it does not specify any training, validation, or test dataset splits (e.g., percentages or sample counts).
Hardware Specification	No	The paper does not provide specific hardware details such as GPU models, CPU types, or memory amounts used for running the experiments.
Software Dependencies	No	The paper does not specify any software dependencies with version numbers (e.g., Python version, library versions, or specific solver versions) used in the experiments.
Experiment Setup	Yes	As shown in Fig. 6a, Lenc = 3 produces the most stable trend and achieves the highest reward. Based on the above experiments, we recommend setting Lenc = 3, which we used to produce our main results. To explore the relationship between Lenc and group size, we use the optimality gap, the value between the true optimal reward and the learned reward achieved by the algorithm, to illustrate the variations.
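
The optimality gap quoted in the Experiment Setup row can be illustrated with a minimal sketch. It assumes the gap is the simple difference between the true optimal reward and the learned reward; the paper may define or normalize it differently. The function name and all reward values below are hypothetical and are not results from the paper.

```python
def optimality_gap(optimal_reward: float, learned_reward: float) -> float:
    """Assumed definition: true optimal reward minus the reward the algorithm achieved."""
    return optimal_reward - learned_reward


# Hypothetical numbers for illustration only (NOT taken from the paper):
# learned reward for three made-up encoder depths, keyed by Lenc.
hypothetical_optimal = 100.0
hypothetical_learned = {1: 80.0, 2: 90.0, 3: 95.0}

# Gap per encoder depth; a smaller gap means the learned policy is closer to optimal.
gaps = {
    l_enc: optimality_gap(hypothetical_optimal, reward)
    for l_enc, reward in hypothetical_learned.items()
}

# Pick the depth with the smallest gap under these made-up numbers.
best_l_enc = min(gaps, key=gaps.get)
```

Under this assumed definition, a gap of zero means the learned reward matches the optimum, so comparing gaps across Lenc values and group sizes shows how far each setting falls short.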