Adversarial Training for Defense Against Label Poisoning Attacks

Authors: Melis Ilayda Bal, Volkan Cevher, Michael Muehlebach

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We provide a theoretical analysis of our algorithm's convergence properties and empirically evaluate FLORAL's effectiveness across diverse classification tasks. Compared to robust baselines and foundation models such as RoBERTa, FLORAL consistently achieves higher robust accuracy under increasing attacker budgets. These results underscore the potential of FLORAL to enhance the resilience of machine learning models against label poisoning threats, thereby ensuring robust classification in adversarial settings."
Researcher Affiliation | Collaboration | "1 Max Planck Institute for Intelligent Systems, Tübingen, Germany; 2 LIONS, EPFL; 3 AGI Foundations, Amazon"
Pseudocode | Yes | "Algorithm 1 FLORAL... Algorithm 2 ProjectionViaFixedPointIteration... Algorithm 3 FLORAL-MultiClass"
Open Source Code | Yes | "Code is available at https://github.com/melisilaydabal/floral."
Open Datasets | Yes | "Moon (Pedregosa et al., 2011): We employed a synthetic benchmark dataset, D = {(x_i, y_i)}_{i=1}^{2000}, where x_i ∈ R^2 and y_i ∈ {±1}... IMDB (Maas et al., 2011): A benchmark review sentiment analysis dataset with D = {(x_i, y_i)}_{i=1}^{50000}, where x_i ∈ R^768 and y_i ∈ {±1}... MNIST (Deng, 2012): In Appendix E.3, we provide the additional experiments with the MNIST dataset in detail."
Dataset Splits | Yes | "We conducted five replications with different train/test splits, including the corresponding adversarial datasets for each dataset... The IMDB review sentiment analysis benchmark dataset (Maas et al., 2011) contains train and test sets of 25,000 examples each. We used 20,000 randomly selected points from the training set as training examples, and the rest as validation examples."
Hardware Specification | Yes | "We fine-tune the RoBERTa-base model (Liu et al., 2019) on this dataset and extracted features (768-dimensional embeddings) to train SVM-related models on this dataset. ... using a single NVIDIA A100 40GB GPU."
Software Dependencies | No | The paper mentions several software components ('scikit-learn library', 'RoBERTa-base model', 'SGD optimizer', 'AdamW optimizer') but does not provide specific version numbers for these dependencies, which is required for a 'Yes' answer.
Experiment Setup | Yes | "Experimental setup. For all SVM-based methods, we used an RBF kernel, exploring various values of C and γ. We conducted five replications with different train/test splits, including the corresponding adversarial datasets for each dataset. In all FLORAL experiments, we constrain the attacker's capability with a limited budget. That is, the attacker identifies the most influential candidate points, with B = 2k, from the training set and randomly selects k ∈ {1, 2, 5, 10, 25} to poison, where k represents the percentage of points relative to the training set size. Detailed experimental configurations are provided in Appendix C (see Table 3)."
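The budgeted label-poisoning setup described in the last row can be sketched in a few lines of scikit-learn. This is an illustrative stand-in only, not the paper's FLORAL algorithm or its attacker: the "influence" proxy used here (distance to the SVM decision boundary) and all parameter values are assumptions chosen for illustration.

```python
# Illustrative sketch of the budgeted label-poisoning setup on the two-moons
# dataset: pick a candidate pool of B = 2k% "influential" training points,
# randomly flip the labels of k% of them, and compare test accuracy of an
# RBF-kernel SVM trained on clean vs. poisoned labels.
# NOTE: the margin-based influence proxy is an assumption, not the paper's method.
import numpy as np
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X, y = make_moons(n_samples=2000, noise=0.2, random_state=0)
y = 2 * y - 1  # map labels from {0, 1} to {-1, +1}, matching the paper's setup

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

clf = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X_tr, y_tr)

# Attacker budget: poison k% of the training points; candidate pool B = 2k%.
k = 5  # percent of the training set
n_poison = int(len(y_tr) * k / 100)
n_candidates = 2 * n_poison

# Influence proxy (assumption): points closest to the decision boundary.
margins = np.abs(clf.decision_function(X_tr))
candidates = np.argsort(margins)[:n_candidates]
flip = rng.choice(candidates, size=n_poison, replace=False)

y_poisoned = y_tr.copy()
y_poisoned[flip] = -y_poisoned[flip]  # flip the selected labels

clean_acc = clf.score(X_te, y_te)
poisoned_acc = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X_tr, y_poisoned).score(X_te, y_te)
print(f"clean test acc: {clean_acc:.3f}, poisoned test acc: {poisoned_acc:.3f}")
```

Sweeping k over {1, 2, 5, 10, 25}, as in the paper's setup, would then trace out robust accuracy as a function of attacker budget for whichever defense is being evaluated.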