Towards Understanding Adversarial Transferability in Federated Learning
Authors: Yijiang Li, Ying Gao, Haohan Wang
TMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically show that the proposed attack imposes a high security risk to current FL systems. By using only 3% of the client's data, we achieve the highest attack rate of over 80%. To further offer a full understanding of the challenges the FL system faces in transferable attacks, we provide a comprehensive analysis over the transfer robustness of FL across a spectrum of configurations. Surprisingly, FL systems show a higher level of robustness than their centralized counterparts, especially when both systems are equally good at handling regular, non-malicious data. Both practical experiments and theoretical analysis support our conclusions. |
| Researcher Affiliation | Academia | Yijiang Li, University of California San Diego; Ying Gao, South China University of Technology; Haohan Wang, University of Illinois Urbana-Champaign |
| Pseudocode | Yes | Algorithm 1: Our Attack as Benign Setting. Input: a set of N clients {C1, C2, ..., CN} where M clients {C1, C2, ..., CM} are corrupted by the attacker, datasets Xi for each client Ci, sampling rate K, total communication rounds T, local iteration number E, and surrogate model training iteration number Ts. Output: adversarial perturbation δ on example x |
| Open Source Code | No | The paper does not provide explicit links to source code or statements of code release. |
| Open Datasets | Yes | We use PGD (Madry et al., 2017) with 10 iterations, no random restart, and an epsilon of 8/255 under the L∞ norm on CIFAR10. For experiments on ImageNet200, we use PGD (Madry et al., 2017) with the same setup but an epsilon of 2/255. We further perform experiments on CIFAR100, SVHN and a much larger and more realistic dataset, ImageNet200, as demonstrated in (a) of Fig. 5, where similar trends are demonstrated. |
| Dataset Splits | Yes | We split the dataset into 100 partitions of equal size in an iid fashion. We provide two approaches to control the heterogeneity of the distributed data. 1. Change the maximum number of classes per client (McMahan et al., 2017). 2. Change the alpha value of the Dirichlet distribution (α) (Wang et al., 2020; Yurochkin et al., 2019) (smaller α means a more non-iid partition) and the log-normal variance (sgm) of the Log-Normal distribution (larger variance denotes more unbalanced partitions) used in unbalanced partitioning (Zeng et al., 2021). We leverage the FedLab framework to generate the different partitions (Zeng et al., 2021). |
| Hardware Specification | Yes | All experiments are conducted with one RTX 2080Ti GPU. |
| Software Dependencies | No | The paper mentions using the 'FedLab framework' but does not specify software dependencies with version numbers. Other mentions are methods (PGD, AutoAttack, skip attack) or models (ResNet50, CNN). |
| Experiment Setup | Yes | For the federated model, we use SGD without momentum, weight decay of 1e-3, learning rate of 0.1, and local batch size of 50, following (Acar et al., 2021). We follow the cross-device setting and use a subset of clients in each round of training; we use 10% as the default where not specified. We train locally for 5 epochs on ResNet50 and 1 epoch on the CNN. For centralized and source model training, we leverage SGD with a momentum of 0.9, weight decay of 1e-3, a learning rate of 0.01, and a batch size of 64. For adversarial training, we use the same setting as the centralized case and leverage PGD to perform the adversarial training. We refer to (Zizzo et al., 2020) for the details of adversarial training. |
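The PGD configuration quoted in the Open Datasets row (10 iterations, epsilon of 8/255 under the L∞ norm, no random restart) can be sketched as follows. This is a minimal NumPy illustration, not the paper's code (no code was released): the step size `alpha` and the gradient callback interface are assumptions for the sake of a self-contained example.

```python
import numpy as np

def pgd_linf(x, y, grad_fn, eps=8/255, steps=10, alpha=2/255):
    """L-infinity PGD without random restart: start at the clean input x,
    take `steps` signed-gradient ascent steps of size `alpha`, projecting
    back into the eps-ball around x and clipping to the valid range [0, 1].
    `grad_fn(x_adv, y)` must return the loss gradient w.r.t. the input."""
    x_adv = x.copy()
    for _ in range(steps):
        g = grad_fn(x_adv, y)                      # gradient of the loss w.r.t. the input
        x_adv = x_adv + alpha * np.sign(g)         # ascent step on the loss
        x_adv = np.clip(x_adv, x - eps, x + eps)   # project into the L-inf eps-ball
        x_adv = np.clip(x_adv, 0.0, 1.0)           # keep a valid image
    return x_adv
```

With `eps=2/255` the same routine matches the ImageNet200 setup quoted above; in practice `grad_fn` would backpropagate through the surrogate model rather than a toy classifier.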
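The Dirichlet-based non-iid partitioning described in the Dataset Splits row (smaller α means a more non-iid split; the paper uses the FedLab framework) can be sketched as below. This is a generic illustration of the technique, not FedLab's API; function name and defaults are hypothetical.

```python
import numpy as np

def dirichlet_partition(labels, n_clients=100, alpha=0.5, seed=0):
    """Partition sample indices across clients: for each class, draw
    client shares from Dirichlet(alpha) and split that class's indices
    accordingly. Smaller alpha concentrates each class on few clients
    (more non-iid); large alpha approaches a uniform iid split."""
    rng = np.random.default_rng(seed)
    clients = [[] for _ in range(n_clients)]
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)
        rng.shuffle(idx)
        p = rng.dirichlet(alpha * np.ones(n_clients))   # per-client share of class c
        cuts = (np.cumsum(p)[:-1] * len(idx)).astype(int)
        for client, shard in zip(clients, np.split(idx, cuts)):
            client.extend(shard.tolist())
    return clients
```

The alternative knobs quoted in the table (maximum classes per client, log-normal variance for unbalanced sizes) change how `p` is constructed, but the per-class index-splitting skeleton stays the same.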
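The cross-device setting described in the Experiment Setup row (sample a subset of clients each round, 10% by default, run local SGD, then aggregate) follows the usual FedAvg skeleton, sketched below. This is a schematic assumption, not the paper's implementation: weights are uniform across the sampled clients and `local_update` stands in for the quoted local SGD (5 epochs on ResNet50, 1 on the CNN).

```python
import numpy as np

def fedavg_round(global_w, client_datasets, local_update, sample_rate=0.1, seed=0):
    """One cross-device FedAvg round: sample a fraction of clients,
    run each one's local update starting from the current global
    weights, and average the resulting weight vectors uniformly."""
    rng = np.random.default_rng(seed)
    n = len(client_datasets)
    chosen = rng.choice(n, size=max(1, int(sample_rate * n)), replace=False)
    local_ws = [local_update(global_w.copy(), client_datasets[i]) for i in chosen]
    return np.mean(local_ws, axis=0)
```

The server repeats this for T communication rounds; in the attack setting of Algorithm 1, corrupted clients participate exactly like benign ones, which is why their updates blend into this average.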