Protecting against simultaneous data poisoning attacks
Authors: Neel Alex, Muhammad Shoaib Ahmed Siddiqui, Amartya Sanyal, David Krueger
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate that multiple backdoors can be simultaneously installed... Furthermore, we show that existing backdoor defense methods do not effectively defend... Finally, we leverage insights... to develop a new defense, BaDLoss (Backdoor Detection via Loss Dynamics), that is effective in the multi-attack setting. With minimal clean accuracy degradation, BaDLoss attains an average attack success rate in the multi-attack setting of 7.98% on CIFAR-10, 10.29% on GTSRB, and 19.17% on Imagenette, compared to the average of other defenses at 63.44%, 74.83%, and 41.74% respectively. BaDLoss scales to ImageNet-1k, reducing the average attack success rate from 88.57% to 15.61%. |
| Researcher Affiliation | Academia | Neel Alex, University of Cambridge; Shoaib Ahmed Siddiqui, University of Cambridge; Amartya Sanyal, Department of Computer Science, University of Copenhagen; David Krueger, Mila, University of Montreal |
| Pseudocode | Yes | Appendix A (BaDLoss Pseudocode), Algorithm 1: PyTorch pseudocode for BaDLoss. |
| Open Source Code | Yes | We open-source our code to aid replication and further study, available on GitHub: https://github.com/shoaibahmed/badloss/ |
| Open Datasets | Yes | We use the standard computer vision datasets: CIFAR-10 (Krizhevsky, 2009), GTSRB (Houben et al., 2013), and Imagenette (Howard, 2019)... ImageNet-1k (Deng et al., 2009). |
| Dataset Splits | Yes | Additionally, we assume that the defender has access to a small set of guaranteed clean examples (250 examples in our case)... Attack success rate (ASR) is evaluated on the full test set excluding the target class... The overall fraction of the dataset which is poisoned is approximately 8% on CIFAR-10, 10% on GTSRB, and 9% on Imagenette. |
| Hardware Specification | No | The paper mentions the use of ResNet-50 and ResNet-18 architectures but does not specify any hardware details like GPU models, CPU models, or memory specifications used for the experiments. |
| Software Dependencies | No | We train using PyTorch (Ansel et al., 2024). Our nearest neighbors classifier uses scikit-learn (Pedregosa et al., 2011). Plots were generated with Matplotlib (Hunter, 2007). While software names are mentioned, specific version numbers are not provided. |
| Experiment Setup | Yes | We use the AdamW optimizer with learning rate γ = 1e-3 (with a cosine-annealing learning rate schedule), weight decay λ = 1e-4, and (β1, β2) = (0.9, 0.999). We train for 100 epochs on CIFAR-10 and GTSRB, and 250 epochs on Imagenette... In CIFAR-10, we use a batch size of 128. In GTSRB, Imagenette, and ImageNet, we use a batch size of 256. In CIFAR-10, we use a crop-and-pad (4px max) and random horizontal flip augmentation... In GTSRB, we use no augmentations. In Imagenette and ImageNet, we use random resized crop (scale=0.08-1) and random flip augmentations. |
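For reference, the hyperparameters quoted in the Experiment Setup row can be collected into a single configuration sketch. This is an illustrative summary only (the structure and names below are our own, not taken from the authors' repository); it encodes exactly the values reported in the paper.

```python
# Optimizer settings reported in the paper: AdamW with cosine-annealing schedule.
OPTIMIZER = {
    "name": "AdamW",
    "lr": 1e-3,            # learning rate gamma
    "weight_decay": 1e-4,  # lambda
    "betas": (0.9, 0.999),
    "lr_schedule": "cosine_annealing",
}

# Per-dataset training settings as reported (epochs for ImageNet-1k not quoted here).
DATASETS = {
    "CIFAR-10": {
        "epochs": 100,
        "batch_size": 128,
        "augmentations": ["crop_and_pad_4px", "random_horizontal_flip"],
    },
    "GTSRB": {
        "epochs": 100,
        "batch_size": 256,
        "augmentations": [],
    },
    "Imagenette": {
        "epochs": 250,
        "batch_size": 256,
        "augmentations": ["random_resized_crop_scale_0.08_1", "random_flip"],
    },
    "ImageNet-1k": {
        "batch_size": 256,
        "augmentations": ["random_resized_crop_scale_0.08_1", "random_flip"],
    },
}
```

Laying the settings out this way makes the reproducibility gaps visible at a glance: every field above is stated in the paper, while anything absent (e.g. ImageNet-1k epoch count, hardware, library versions) would need to be recovered from the released code.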