Adversarial Training for Defense Against Label Poisoning Attacks
Authors: Melis Ilayda Bal, Volkan Cevher, Michael Muehlebach
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We provide a theoretical analysis of our algorithm's convergence properties and empirically evaluate FLORAL's effectiveness across diverse classification tasks. Compared to robust baselines and foundation models such as RoBERTa, FLORAL consistently achieves higher robust accuracy under increasing attacker budgets. These results underscore the potential of FLORAL to enhance the resilience of machine learning models against label poisoning threats, thereby ensuring robust classification in adversarial settings. |
| Researcher Affiliation | Collaboration | ¹Max Planck Institute for Intelligent Systems, Tübingen, Germany; ²LIONS, EPFL; ³AGI Foundations, Amazon. EMAIL, EMAIL, EMAIL |
| Pseudocode | Yes | Algorithm 1 FLORAL... Algorithm 2 PROJECTIONVIAFIXEDPOINTITERATION... Algorithm 3 FLORAL-Multi Class |
| Open Source Code | Yes | Code is available at https://github.com/melisilaydabal/floral. |
| Open Datasets | Yes | Moon (Pedregosa et al., 2011): We employed a synthetic benchmark dataset, D = {(xᵢ, yᵢ)}ᵢ₌₁²⁰⁰⁰ where xᵢ ∈ ℝ² and yᵢ ∈ {±1}... IMDB (Maas et al., 2011): A benchmark review sentiment analysis dataset with D = {(xᵢ, yᵢ)}ᵢ₌₁⁵⁰⁰⁰⁰ where xᵢ ∈ ℝ⁷⁶⁸ and yᵢ ∈ {±1}... MNIST (Deng, 2012): In Appendix E.3, we provide the additional experiments with the MNIST dataset in detail. |
| Dataset Splits | Yes | We conducted five replications with different train/test splits, including the corresponding adversarial datasets for each dataset... The IMDB review sentiment analysis benchmark dataset (Maas et al., 2011) contains train and test datasets, each containing 25,000 examples. We used 20,000 randomly selected points from the training set as training examples, and the rest as validation examples. |
| Hardware Specification | Yes | We fine-tune the RoBERTa-base model (Liu et al., 2019) on this dataset and extracted features (768-dimensional embeddings) to train SVM-related models on this dataset. ... using a single NVIDIA A100 40GB GPU. |
| Software Dependencies | No | The paper mentions several software components like 'scikit-learn library', 'RoBERTa-base model', 'SGD optimizer', 'AdamW optimizer', but does not provide specific version numbers for these software dependencies, which is required for a 'Yes' answer. |
| Experiment Setup | Yes | Experimental setup. For all SVM-based methods, we used the RBF kernel, exploring various values of C and γ. We conducted five replications with different train/test splits, including the corresponding adversarial datasets for each dataset. In all FLORAL experiments, we constrain the attacker's capability with a limited budget. That is, the attacker identifies the most influential candidate points, with B = 2k, from the training set and randomly selects k ∈ {1, 2, 5, 10, 25} to poison, where k represents the % of points relative to the training set size. Detailed experimental configurations are provided in Appendix C (see Table 3). |
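The budgeted label-flipping setup quoted in the Experiment Setup row can be sketched as below. This is a minimal illustration, not the paper's attacker: `poison_labels` is a hypothetical helper, and flipping uniformly at random stands in for the paper's influence-based candidate selection (B = 2k candidates, k% poisoned).

```python
import random

def poison_labels(labels, k_percent, seed=0):
    """Flip the labels of k% of the training points, chosen uniformly at
    random, to mimic a budget-limited label-poisoning attacker.
    Labels are assumed to lie in {-1, +1}."""
    rng = random.Random(seed)
    n = len(labels)
    budget = max(1, round(n * k_percent / 100))  # k% of the training set size
    flip_idx = rng.sample(range(n), budget)      # distinct indices to poison
    poisoned = list(labels)
    for i in flip_idx:
        poisoned[i] = -poisoned[i]               # flip -1 <-> +1
    return poisoned, flip_idx

# Usage: poison 10% of a 100-point binary-labeled training set.
clean = [1] * 50 + [-1] * 50
poisoned, flipped = poison_labels(clean, k_percent=10)
```

Each replication in the paper would then retrain the defended model (e.g., an RBF-kernel SVM over a grid of C and γ) on such a poisoned set and report robust accuracy as k grows.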