Single-agent Poisoning Attacks Suffice to Ruin Multi-Agent Learning
Authors: Fan Yao, Yuwei Cheng, Ermin Wei, Haifeng Xu
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our theoretical and empirical results reveal an intrinsic efficiency-robustness trade-off: the faster an algorithm converges, the more vulnerable it becomes to utility poisoning attacks. To the best of our knowledge, this is the first work to identify and characterize such a trade-off in the context of multi-agent learning. |
| Researcher Affiliation | Academia | Fan Yao, Department of Computer Science, University of Virginia, Charlottesville, VA 22904, USA; Yuwei Cheng, Department of Statistics, University of Chicago, Chicago, IL 60637, USA; Ermin Wei, Electrical and Computer Engineering & Industrial Engineering and Management Sciences, Northwestern University; Haifeng Xu, Department of Computer Science, University of Chicago |
| Pseudocode | Yes | Algorithm 1: MD-SCB under Utility Poisoning Attack (Ba et al., 2024) Algorithm 2: Single Agent Mirror Descent Self-Concordant Barrier with Bandit Feedback under Utility Shifting Attack Algorithm 3: MAMD (Bravo et al., 2018) under Utility Shifting Attack |
| Open Source Code | No | The paper does not explicitly state that the authors' code for the methodology described is open-source or provide a link to a code repository. It mentions 'Smarts: An open-source scalable multi-agent rl training school for autonomous driving' in the references, but this refers to a third-party tool. |
| Open Datasets | No | The paper conducts experiments on simulated game instances (n-person Cournot games, Tullock contests) by defining parameters (e.g., a=10, b=0.05, ci=1). It does not use or provide concrete access information for a publicly available or open dataset. |
| Dataset Splits | No | The paper does not utilize external datasets for experiments, but rather simulated game instances. Therefore, the concept of training/test/validation splits for data is not applicable in this context. It defines game parameters for simulation. |
| Hardware Specification | No | The paper describes the parameters of the simulated game instances and the algorithms used, but it does not specify any details regarding the hardware (e.g., GPU models, CPU types, memory) used to run the experiments. |
| Software Dependencies | No | The paper refers to algorithms like MD-SCB and MAMD and discusses their implementation, but it does not provide specific software names with version numbers (e.g., Python, PyTorch, TensorFlow versions, or other libraries/solvers with their versions) that were used for the implementation. |
| Experiment Setup | Yes | Experimental Setup: For the n-person Cournot game instance, we set a = 10, b = 0.05, and the cost ci = 1 for all agents i ∈ [n]. The action space is set to Xi = [0, 50], and the unique NE of such games can be verified as x*_i = 180/(n+1) for all i ∈ [n]. We let each agent run MD-SCB with the game size n ∈ {10, 50, 100}. The learning rate schedule of MD-SCB is set to ηt ∝ t^(−ϕ) with ϕ ∈ {0.5, 0.7, 0.9}, corresponding to convergence rates α = 0.25, 0.15, 0.05 (p = 2) when no attack is present. We implement SUSA against agent 1 with a fixed δ = 10.0. |
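The NE value quoted in the setup row can be checked directly. The sketch below (a minimal illustration, not the authors' code; the function name `cournot_ne` is ours) derives the symmetric Nash equilibrium of the linear n-person Cournot game with inverse demand p(X) = a − b·X and per-unit cost c, and confirms that the paper's parameters a = 10, b = 0.05, c = 1 yield x*_i = 180/(n+1):

```python
def cournot_ne(a: float, b: float, c: float, n: int) -> float:
    """Symmetric NE quantity of a linear Cournot game.

    Agent i's utility is u_i = x_i * (a - b * sum_j x_j) - c * x_i.
    Setting du_i/dx_i = 0 at a symmetric profile gives
    x* = (a - c) / (b * (n + 1)).
    """
    return (a - c) / (b * (n + 1))


for n in (10, 50, 100):  # game sizes used in the paper's experiments
    x_star = cournot_ne(a=10.0, b=0.05, c=1.0, n=n)
    # With a=10, b=0.05, c=1 the formula reduces to 180 / (n + 1).
    assert abs(x_star - 180.0 / (n + 1)) < 1e-12
    # First-order condition check: marginal utility vanishes at the NE.
    foc = 10.0 - 0.05 * (n * x_star) - 0.05 * x_star - 1.0
    assert abs(foc) < 1e-9
    print(n, x_star)
```

This confirms the claimed equilibrium lies inside the action space Xi = [0, 50] for all three game sizes (e.g., x* ≈ 16.36 for n = 10).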