Robust Multi-Agent Reinforcement Learning with State Uncertainty
Authors: Sihong He, Songyang Han, Sanbao Su, Shuo Han, Shaofeng Zou, Fei Miao
TMLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments show that the proposed RMAQ algorithm converges to the optimal value function; our RMAAC algorithm outperforms several MARL and robust MARL methods in multiple multi-agent environments when state uncertainty is present. The source code is public on https://github.com/sihongho/robust_marl_with_state_uncertainty. |
| Researcher Affiliation | Academia | Sihong He, Songyang Han, Sanbao Su (Department of Computer Science and Engineering, University of Connecticut); Shuo Han (Department of Electrical and Computer Engineering, University of Illinois Chicago); Shaofeng Zou (Department of Electrical Engineering, University at Buffalo, The State University of New York); Fei Miao (Department of Computer Science and Engineering, University of Connecticut) |
| Pseudocode | Yes | Algorithm 1: RMAAC with deterministic policies |
| Open Source Code | Yes | The source code is public on https://github.com/sihongho/robust_marl_with_state_uncertainty. |
| Open Datasets | Yes | We run experiments in several benchmark multi-agent scenarios, based on the multi-agent particle environments (MPE) (Lowe et al., 2017). |
| Dataset Splits | No | The paper mentions the duration of testing: "The testing step is chosen as 10000 and each episode contains 25 steps." However, it does not provide train/validation/test dataset splits (e.g., percentages, sample counts, or citations to predefined splits) needed to reproduce the data partitioning. |
| Hardware Specification | Yes | The host machine used in our experiments is a server configured with an AMD Ryzen Threadripper 2990WX 32-core processor and four Quadro RTX 6000 GPUs. |
| Software Dependencies | Yes | All experiments are performed on Python 3.5.4, Gym 0.10.5, Numpy 1.14.5, Tensorflow 1.8.0, and CUDA 9.0. |
| Experiment Setup | Yes | The hyper-parameters used to train RMAAC and the baseline algorithms are summarized in Appendix C.2.2, Table 4. |
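
The software-dependency row above pins exact versions; a minimal sketch of a pinned `requirements.txt` matching those versions might look as follows. The file name, the `tensorflow-gpu` package choice (the paper only says "Tensorflow 1.8.0", though GPU hardware is reported), and the assumption that Python 3.5.4 and CUDA 9.0 are provided by the host environment are our assumptions, not stated in the paper.

```
# Assumes Python 3.5.4 and CUDA 9.0 are installed system-wide (per the paper)
gym==0.10.5
numpy==1.14.5
tensorflow-gpu==1.8.0  # "Tensorflow 1.8.0" in the paper; GPU variant is an assumption
```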