What is the Solution for State-Adversarial Multi-Agent Reinforcement Learning?

Authors: Songyang Han, Sanbao Su, Sihong He, Shuo Han, Haizhao Yang, Shaofeng Zou, Fei Miao

TMLR 2024

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Our experiments demonstrate that our algorithm outperforms existing methods when faced with state perturbations and greatly improves the robustness of MARL policies. |
| Researcher Affiliation | Collaboration | Songyang Han, School of Computing, University of Connecticut; Sony AI |
| Pseudocode | Yes | Algorithm 1: Robust Multi-Agent Adversarial Actor-Critic (RMA3C) Algorithm |
| Open Source Code | Yes | Our code is public on https://songyanghan.github.io/what_is_solution/. |
| Open Datasets | Yes | To demonstrate the effectiveness of our algorithm, we utilize the multi-agent particle environments developed in Lowe et al. (2017), which consist of multiple agents and landmarks in a 2D world. |
| Dataset Splits | Yes | For both training and testing, we report statistics that are averaged across 10 runs in each scenario and algorithm. ... The mean episode rewards are averaged across 2000 episodes and 10 test runs in each environment. |
| Hardware Specification | Yes | The host machine adopted in our experiments is a server configured with AMD Ryzen Threadripper 2990WX 32-core processors and four Quadro RTX 6000 GPUs. |
| Software Dependencies | Yes | Our experiments are performed on Python 3.5.4, Gym 0.10.5, Numpy 1.14.5, Tensorflow 1.8.0, and CUDA 9.0. |
| Experiment Setup | Yes | Table 3: Hyperparameters for our RMA3C algorithm and the baselines. |