Single-agent Poisoning Attacks Suffice to Ruin Multi-Agent Learning

Authors: Fan Yao, Yuwei Cheng, Ermin Wei, Haifeng Xu

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Our theoretical and empirical results reveal an intrinsic efficiency-robustness trade-off: the faster an algorithm converges, the more vulnerable it becomes to utility poisoning attacks. To the best of our knowledge, this is the first work to identify and characterize such a trade-off in the context of multi-agent learning."
Researcher Affiliation | Academia | Fan Yao, Department of Computer Science, University of Virginia, Charlottesville, VA 22904, USA; Yuwei Cheng, Department of Statistics, University of Chicago, Chicago, IL 60637, USA; Ermin Wei, Electrical and Computer Engineering / Industrial Engineering and Management Sciences, Northwestern University; Haifeng Xu, Department of Computer Science, University of Chicago
Pseudocode | Yes | Algorithm 1: MD-SCB under Utility Poisoning Attack (Ba et al., 2024); Algorithm 2: Single Agent Mirror Descent Self-Concordant Barrier with Bandit Feedback under Utility Shifting Attack; Algorithm 3: MAMD (Bravo et al., 2018) under Utility Shifting Attack
Open Source Code | No | The paper does not explicitly state that the authors' code for the methodology described is open-source, nor does it provide a link to a code repository. It mentions "Smarts: An open-source scalable multi-agent rl training school for autonomous driving" in the references, but this refers to a third-party tool.
Open Datasets | No | The paper conducts experiments on simulated game instances (n-person Cournot games, Tullock contests) defined by parameters (e.g., a = 10, b = 0.05, c_i = 1). It does not use, or provide concrete access information for, a publicly available open dataset.
Dataset Splits | No | The paper does not use external datasets; its experiments run on simulated game instances defined by game parameters. The concept of training/validation/test splits is therefore not applicable.
Hardware Specification | No | The paper describes the parameters of the simulated game instances and the algorithms used, but it does not specify the hardware (e.g., GPU models, CPU types, memory) used to run the experiments.
Software Dependencies | No | The paper refers to algorithms such as MD-SCB and MAMD and discusses their implementation, but it does not name specific software with version numbers (e.g., Python, PyTorch, or TensorFlow versions, or other libraries/solvers) used for the implementation.
Experiment Setup | Yes | Experimental Setup: For the n-person Cournot game instance, we set a = 10, b = 0.05, and the cost c_i = 1 for all agents i ∈ [n]. The action space is set to X_i = [0, 50], and the unique NE of such games can be verified as x*_i = 180/(n+1) for all i ∈ [n]. We let each agent run MD-SCB with the game size n ∈ {10, 50, 100}. The learning rate schedule of MD-SCB is set to η_t ∝ t^{-ϕ} with ϕ ∈ {0.5, 0.7, 0.9}, corresponding to convergence rates α = 0.25, 0.15, 0.05 (with p = 2) when no attack is present. We implement SUSA against agent 1 with a fixed δ = 10.0.
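The Cournot setup in the row above can be checked numerically. The sketch below uses the stated parameters (a = 10, b = 0.05, c_i = 1, X_i = [0, 50]) to verify the closed-form NE x*_i = (a − c)/(b(n+1)) = 180/(n+1), and runs plain projected gradient play as a simple stand-in learner. This is not the paper's MD-SCB algorithm, and the step size and horizon are illustrative assumptions.

```python
import numpy as np

# Game parameters from the experimental setup described in the report.
a, b, c = 10.0, 0.05, 1.0

def payoff(i, x):
    # Cournot payoff: u_i(x) = x_i * (a - b * sum_j x_j) - c * x_i
    return x[i] * (a - b * x.sum()) - c * x[i]

def nash_quantity(n):
    # Symmetric NE from the first-order condition a - c - b * sum_j x_j - b * x_i = 0:
    # x_i* = (a - c) / (b * (n + 1)) = 180 / (n + 1) for these parameters.
    return (a - c) / (b * (n + 1))

def gradient_play(n, steps=5000, eta=0.05):
    # Projected gradient ascent for every agent (a stand-in for MD-SCB,
    # which the paper uses; eta and steps are assumptions).
    x = np.full(n, 1.0)
    for _ in range(steps):
        grad = a - c - b * x.sum() - b * x   # d u_i / d x_i
        x = np.clip(x + eta * grad, 0.0, 50.0)  # project onto X_i = [0, 50]
    return x

n = 10
x_star = nash_quantity(n)            # 180 / 11 ≈ 16.36
x_learned = gradient_play(n)
print(x_star, np.max(np.abs(x_learned - x_star)))
```

The paper's single-agent utility shifting attack (SUSA, δ = 10.0 against agent 1) would perturb the feedback agent 1 receives in the loop above; its exact form is specified in the paper's Algorithm 2 and is not reproduced here.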