What is the Solution for State-Adversarial Multi-Agent Reinforcement Learning?

Authors: Songyang Han, Sanbao Su, Sihong He, Shuo Han, Haizhao Yang, Shaofeng Zou, Fei Miao

TMLR 2024

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Our experiments demonstrate that our algorithm outperforms existing methods when faced with state perturbations and greatly improves the robustness of MARL policies. |
| Researcher Affiliation | Collaboration | Songyang Han, School of Computing, University of Connecticut; Sony AI |
| Pseudocode | Yes | Algorithm 1: Robust Multi-Agent Adversarial Actor-Critic (RMA3C) Algorithm |
| Open Source Code | Yes | Our code is public on https://songyanghan.github.io/what_is_solution/. |
| Open Datasets | Yes | To demonstrate the effectiveness of our algorithm, we utilize the multi-agent particle environments developed in Lowe et al. (2017), which consist of multiple agents and landmarks in a 2D world. |
| Dataset Splits | Yes | For both training and testing, we report statistics that are averaged across 10 runs in each scenario and algorithm. ... The mean episode rewards are averaged across 2000 episodes and 10 test runs in each environment. |
| Hardware Specification | Yes | The host machine adopted in our experiments is a server configured with AMD Ryzen Threadripper 2990WX 32-core processors and four Quadro RTX 6000 GPUs. |
| Software Dependencies | Yes | Our experiments are performed on Python 3.5.4, Gym 0.10.5, Numpy 1.14.5, Tensorflow 1.8.0, and CUDA 9.0. |
| Experiment Setup | Yes | Table 3: Hyperparameters for our RMA3C algorithm and the baselines. |