Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty, so scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Catastrophic overfitting can be induced with discriminative non-robust features
Authors: Guillermo Ortiz-Jimenez, Pau de Jorge, Amartya Sanyal, Adel Bibi, Puneet K. Dokania, Pascal Frossard, Grégory Rogez, Philip Torr
TMLR 2023 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this work, we study the onset of CO in single-step AT methods through controlled modifications of typical datasets of natural images. In particular, we show that CO can be induced at much smaller ϵ values than it was observed before just by injecting images with seemingly innocuous features. Through extensive experiments we analyze this novel phenomenon and discover that the presence of these easy features induces a learning shortcut that leads to CO. Our findings provide new insights into the mechanisms of CO and improve our understanding of the dynamics of AT. |
| Researcher Affiliation | Collaboration | Guillermo Ortiz-Jimenez (École Polytechnique Fédérale de Lausanne); Pau de Jorge (University of Oxford; Naver Labs Europe); Amartya Sanyal (ETH Zürich; Max Planck Institute for Intelligent Systems, Tübingen); Adel Bibi (University of Oxford); Puneet K. Dokania (University of Oxford; Five AI Ltd.); Pascal Frossard (École Polytechnique Fédérale de Lausanne); Grégory Rogez (Naver Labs Europe) |
| Pseudocode | No | The paper describes methods and concepts through mathematical formulations and textual descriptions but does not include any explicitly labeled pseudocode blocks or algorithms. |
| Open Source Code | No | The paper does not contain any explicit statements about releasing source code for the described methodology, nor does it provide a link to a code repository. |
| Open Datasets | Yes | We train a PreActResNet18 (He et al., 2016) on different intervened versions of CIFAR-10 (Krizhevsky & Hinton, 2009) using FGSM-AT for different robustness budgets ϵ and scales β. Similarly to Section 3 we modify the SVHN, CIFAR-100 and higher resolution ImageNet-100 and TinyImageNet datasets to inject highly discriminative features v(y). |
| Dataset Splits | No | The paper refers to using specific datasets (e.g., CIFAR-10, SVHN) and mentions training and testing, but it does not explicitly provide details about the splits used (e.g., percentages for training/validation/test sets) or cite a specific standard split setup for reproducibility. |
| Hardware Specification | Yes | All our experiments were performed using a cluster equipped with GPUs of various architectures. The estimated compute budget required to produce all results in this work is around 2,000 GPU hours (in terms of NVIDIA V100 GPUs). |
| Software Dependencies | No | The paper mentions methods like PGD-AT and FGSM, and architectures like PreActResNet18, but it does not specify any software names with version numbers (e.g., Python, PyTorch, TensorFlow, or specific library versions) that would be needed to replicate the experiment. |
| Experiment Setup | Yes | Adversarial training for all methods and datasets follows the fast training schedules with a cyclic learning rate introduced in Wong et al. (2020). We train for 30 epochs on CIFAR (Krizhevsky & Hinton, 2009) and 15 epochs for SVHN (Netzer et al., 2011), following Andriushchenko & Flammarion (2020). When we perform PGD-AT we use 10 steps and a step size α = 2/255; FGSM uses a step size of α = ϵ. Regularization parameters for GradAlign (Andriushchenko & Flammarion, 2020) and N-FGSM (de Jorge et al., 2022) vary and are stated when relevant in the paper. The architecture employed is a PreActResNet18 (He et al., 2016). |
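The setup row above refers to single-step FGSM attacks with step size α = ϵ. As a hedged illustration only (the paper releases no code, so this is not the authors' implementation), the sketch below shows the standard FGSM perturbation δ = ϵ · sign(∇ₓ L) on a toy logistic-regression model in NumPy; the function name and toy model are assumptions for illustration.

```python
import numpy as np

def fgsm_perturbation(x, w, b, y, eps):
    """Single-step FGSM perturbation for a binary logistic model.

    Illustrative sketch: computes delta = eps * sign(grad_x loss),
    where loss is binary cross-entropy of sigmoid(w.x + b) vs. label y.
    """
    z = np.dot(w, x) + b                 # logit
    p = 1.0 / (1.0 + np.exp(-z))         # sigmoid probability
    grad_x = (p - y) * w                 # d(BCE)/dx for logistic regression
    return eps * np.sign(grad_x)         # one FGSM step with alpha = eps

# Example with the common CIFAR-10 budget eps = 8/255
rng = np.random.default_rng(0)
x = rng.standard_normal(10)
w = rng.standard_normal(10)
delta = fgsm_perturbation(x, w, 0.0, 1, eps=8 / 255)
```

Because FGSM takes the sign of the gradient, every coordinate of δ sits exactly at ±ϵ, which is the single-step behavior (α = ϵ) the reviewed paper studies in connection with catastrophic overfitting.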