What is the Solution for State-Adversarial Multi-Agent Reinforcement Learning?
Authors: Songyang Han, Sanbao Su, Sihong He, Shuo Han, Haizhao Yang, Shaofeng Zou, Fei Miao
TMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments demonstrate that our algorithm outperforms existing methods when faced with state perturbations and greatly improves the robustness of MARL policies. |
| Researcher Affiliation | Collaboration | Songyang Han, School of Computing, University of Connecticut; Sony AI |
| Pseudocode | Yes | Algorithm 1: Robust Multi-Agent Adversarial Actor-Critic (RMA3C) Algorithm |
| Open Source Code | Yes | Our code is public on https://songyanghan.github.io/what_is_solution/. |
| Open Datasets | Yes | To demonstrate the effectiveness of our algorithm, we utilize the multi-agent particle environments developed in Lowe et al. (2017) which consist of multiple agents and landmarks in a 2D world. |
| Dataset Splits | Yes | For both training and testing, we report statistics that are averaged across 10 runs in each scenario and algorithm. ... The mean episode rewards are averaged across 2000 episodes and 10 test runs in each environment. |
| Hardware Specification | Yes | The host machine adopted in our experiments is a server configured with AMD Ryzen Threadripper 2990WX 32-core processors and four Quadro RTX 6000 GPUs. |
| Software Dependencies | Yes | Our experiments are performed on Python 3.5.4, Gym 0.10.5, Numpy 1.14.5, Tensorflow 1.8.0, and CUDA 9.0. |
| Experiment Setup | Yes | Table 3: Hyperparameters for our RMA3C algorithm and the baselines. |
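The software versions quoted above (Python 3.5.4, Gym 0.10.5, Numpy 1.14.5, Tensorflow 1.8.0, CUDA 9.0) could be pinned as follows; this is a hypothetical sketch for reconstructing the environment, not a command from the paper, and it assumes Python 3.5 and CUDA 9.0 are already installed on the host:

```shell
# Hypothetical environment pinning based on the versions quoted in the table.
# tensorflow-gpu 1.8.0 is the CUDA-enabled build matching CUDA 9.0;
# use tensorflow==1.8.0 instead for a CPU-only setup.
pip install gym==0.10.5 numpy==1.14.5 tensorflow-gpu==1.8.0
```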