Rethinking Invariance Regularization in Adversarial Training to Improve Robustness-Accuracy Trade-off
Authors: Futa Waseda, Ching-Chun Chang, Isao Echizen
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our method on CIFAR-10, CIFAR-100 (Krizhevsky & Hinton, 2009), and Imagenette (Howard, 2019) datasets. We use the model architectures of ResNet-18 (He et al., 2016) and Wide ResNet-34-10 (WRN-34-10) (Zagoruyko & Komodakis, 2016), following the previous works (Madry et al., 2018; Zhang et al., 2019; Cui et al., 2021). Implementation details of baselines are provided in Appendix A.3. |
| Researcher Affiliation | Academia | Futa Waseda (The University of Tokyo), Ching-Chun Chang (National Institute of Informatics), Isao Echizen (The University of Tokyo; National Institute of Informatics) EMAIL, EMAIL |
| Pseudocode | No | The paper describes the methodology using mathematical formulas and natural language, but no structured pseudocode or algorithm blocks are present. |
| Open Source Code | No | Reproducibility. We provide the hyperparameters used in our experiments in Sec. 4. We also provide the additional implementation details of our method and the baseline methods in the appendix (Appendix A.2 and Appendix A.3). The code will be made available upon publication. |
| Open Datasets | Yes | Models and datasets. We evaluate our method on CIFAR-10, CIFAR-100 (Krizhevsky & Hinton, 2009), and Imagenette (Howard, 2019) datasets. |
| Dataset Splits | Yes | Table 9: The details of datasets we used in our experiments. CIFAR10: resolution 32×32, 10 classes, 50,000 train / 10,000 val. CIFAR100: resolution 32×32, 100 classes, 50,000 train / 10,000 val. Imagenette: resolution 160×160, 10 classes, 9,469 train / 3,925 val. |
| Hardware Specification | Yes | A.2 EXPERIMENTS COMPUTE RESOURCES. In this work, we use NVIDIA A100 GPUs for our experiments. Training of AR-AT on CIFAR10 using ResNet-18 takes approximately 2 hours, and on CIFAR10 using Wide ResNet-34-10 takes approximately 10 hours. |
| Software Dependencies | No | The paper mentions using Python and specific baseline implementations but does not list its key software dependencies with specific version numbers. |
| Experiment Setup | Yes | Training details. We use a 10-step PGD for adversarial training. We initialized the learning rate to 0.1, divided it by a factor of 10 at the 75th and 90th epochs, and trained for 100 epochs. We use the SGD optimizer with a momentum of 0.9 and a weight decay of 5e-4, with a batch size of 128. Cosine distance is used as the distance metric for invariance regularization. The predictor MLP has two linear layers, with the hidden dimension set to 1/4 of the feature dimension, following SimSiam (Chen & He, 2021). The latent representations to be regularized are spatially average-pooled to obtain one-dimensional vectors. The regularization strength γ is set to 30.0 for ResNet-18 and 100.0 for WRN-34-10. We regularize all ReLU outputs in layer4 for ResNet-18, and layer3 for WRN-34-10. |
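The quoted training setup can be sketched in PyTorch. This is a minimal illustration, not the authors' released code (which is unavailable per the Open Source Code row): the names `Predictor` and `cosine_invariance_loss` are hypothetical, and the ReLU inside the predictor is an assumption carried over from SimSiam. Only the hyperparameters stated above (hidden dim = 1/4 of feature dim, cosine distance on spatially average-pooled features, SGD with momentum 0.9, weight decay 5e-4, lr 0.1 divided by 10 at epochs 75 and 90) come from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Predictor(nn.Module):
    """SimSiam-style predictor MLP: two linear layers with the
    hidden dimension set to 1/4 of the feature dimension.
    The intermediate ReLU is an assumption following SimSiam."""
    def __init__(self, dim: int):
        super().__init__()
        hidden = dim // 4
        self.net = nn.Sequential(
            nn.Linear(dim, hidden),
            nn.ReLU(inplace=True),
            nn.Linear(hidden, dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

def cosine_invariance_loss(z_clean: torch.Tensor,
                           z_adv: torch.Tensor,
                           predictor: Predictor) -> torch.Tensor:
    """Cosine-distance invariance term on latent feature maps.
    Feature maps (N, C, H, W) are spatially average-pooled to
    one-dimensional vectors, as described in the setup above."""
    z_clean = z_clean.flatten(2).mean(-1)  # (N, C, H, W) -> (N, C)
    z_adv = z_adv.flatten(2).mean(-1)
    p = predictor(z_adv)
    # 1 - cos(p, stop_grad(z_clean)); stop-gradient on the clean
    # branch is an illustrative choice, not confirmed by the excerpt.
    return (1 - F.cosine_similarity(p, z_clean.detach(), dim=1)).mean()

# Optimizer and schedule exactly as stated: SGD, momentum 0.9,
# weight decay 5e-4, lr 0.1 divided by 10 at epochs 75 and 90.
model = nn.Linear(512, 10)  # stand-in for ResNet-18
optimizer = torch.optim.SGD(model.parameters(), lr=0.1,
                            momentum=0.9, weight_decay=5e-4)
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[75, 90], gamma=0.1)
```

The total training objective would combine the usual adversarial classification loss with γ times this invariance term (γ = 30.0 for ResNet-18, 100.0 for WRN-34-10), but how the two terms are composed is not specified in the quoted excerpt.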