Boosting Adversarial Robustness with CLAT: Criticality Leveraged Adversarial Training
Authors: Bhavna Gopal, Huanrui Yang, Jingyang Zhang, Mark Horton, Yiran Chen
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical results demonstrate that CLAT can be seamlessly integrated with existing adversarial training methods, enhancing clean accuracy and adversarial robustness by over 2% compared to baseline approaches. Experiments and Settings: Datasets and models. We conducted experiments using CIFAR10 and CIFAR100, a typical choice for previous robustness research. Table 1, Table 2, and Table 3 present white-box evaluation results using the PGD and AutoAttack frameworks across CIFAR and ImageNet. |
| Researcher Affiliation | Academia | 1Department of Electrical and Computer Engineering, Duke University, Durham, NC, USA 2Department of Electrical and Computer Engineering, University of Arizona, Tucson, AZ, USA. Correspondence to: Bhavna Gopal <EMAIL>. |
| Pseudocode | Yes | C. Pseudocode of CLAT: To better facilitate an understanding of the CLAT process, we illustrate the pseudocode of the dynamic critical layer identification process and the criticality-targeted fine-tuning process in Algorithm 1. Algorithm 1 (CLAT), line 1: Input: Dataset D, pre-trained model F, batch size bs, total epochs N. |
| Open Source Code | No | The paper does not contain any explicit statement about the release of source code, nor does it provide a link to a code repository in the main text or appendices. |
| Open Datasets | Yes | Datasets and models: We conducted experiments using CIFAR10 and CIFAR100, a typical choice for previous robustness research. Each dataset includes 60,000 color images, each 32×32 pixels, divided into 10 and 100 classes respectively (Krizhevsky & Hinton, 2009). For our experiments, we deployed a suite of network architectures: Wide ResNets (34-10, 70-16) (Zagoruyko & Komodakis, 2017), ResNets (50, 18) (He et al., 2016b), DenseNet-121 (Huang et al., 2017), PreActResNet-18 (He et al., 2016a), and VGG-19 (Simonyan & Zisserman, 2015). In this paper, these architectures are referred to as WRN34-10, WRN70-16, RN50, RN18, DN121, PreActRN18 and VGG19 respectively. Table 2. Clean and Adversarial Accuracies (PGD-20) performance comparison on ImageNet. |
| Dataset Splits | Yes | Datasets and models: We conducted experiments using CIFAR10 and CIFAR100, a typical choice for previous robustness research. Each dataset includes 60,000 color images, each 32×32 pixels, divided into 10 and 100 classes respectively (Krizhevsky & Hinton, 2009). |
| Hardware Specification | Yes | Experiments were conducted on a Titan XP GPU, starting with an initial learning rate of 0.1, which was adjusted according to a cosine decay schedule. To ensure the reliability of robustness measurements, we conducted each experiment a minimum of 10 times, reporting the lowest adversarial accuracies we observed. Table 11. DenseNet-121 critical layers identified with different amounts of data. Time taken to compute critical layers evaluated on a TITAN RTX GPU. 1 PGD-AT epoch takes 67s. |
| Software Dependencies | No | The paper does not specify any software dependencies with version numbers, such as programming languages, libraries, or frameworks. It mentions various methods and models, but not the specific software implementations and their versions. |
| Experiment Setup | Yes | For our baseline, we use PGD for attack generation during training, following a random start (Madry et al., 2019), with an attack budget of ϵ = 0.03 under the ℓ∞ norm, a step size of α = 0.007, and 10 attack steps. The same settings apply to PGD attack evaluations. AutoAttack evaluations (Croce & Hein, 2020) also use a budget of ϵ = 0.03 under the ℓ∞ norm, with no restarts for untargeted APGD-CE, 9 target classes for APGD-DLR, 3 target classes for Targeted FAB, and 5000 queries for Square Attack. These settings remain consistent unless explicitly noted otherwise. Experiments were conducted on a Titan XP GPU, starting with an initial learning rate of 0.1, which was adjusted according to a cosine decay schedule. To ensure the reliability of robustness measurements, we conducted each experiment a minimum of 10 times, reporting the lowest adversarial accuracies we observed. CLAT settings: We select critical layers as described in Section 3.1. Table 10 outlines the Top-5 most critical layers for some of the models and corresponding datasets at the start of the CLAT fine-tuning, after adversarially training with PGD-AT for 50 epochs. In customizing the CLAT methodology to various network sizes, we select approximately 5% of layers as critical through hyperparameter optimization. For instance, DN121 uses 5 critical layers, while WRN70-16, RN50, WRN34-10, VGG19, and RN18 use 4, 3, 2, 1, and 1 critical layers, respectively. |
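The PGD settings quoted in the Experiment Setup row (ℓ∞ budget ϵ = 0.03, step size α = 0.007, 10 steps, random start) follow the generic recipe of Madry et al. Below is a minimal sketch of that recipe. Note the assumptions: the paper's implementation is not released, so this uses a toy linear classifier with an analytic logistic-loss gradient in place of the paper's networks; only the random start, signed ascent step, and ℓ∞ projection correspond to the quoted settings.

```python
import numpy as np

# PGD hyperparameters quoted from the paper's setup (l-inf norm):
EPS = 0.03    # attack budget epsilon
ALPHA = 0.007 # step size alpha
STEPS = 10    # number of attack steps

def pgd_attack_linear(x, y, w, b, eps=EPS, alpha=ALPHA, steps=STEPS, seed=None):
    """L-inf PGD with random start on a toy binary linear classifier.

    Model: logits = w.x + b; loss = log(1 + exp(-y * logits)), y in {-1, +1}.
    The linear model is a stand-in so the gradient is analytic; the
    step/projection logic is the standard PGD recipe.
    """
    rng = np.random.default_rng(seed)
    # Random start: uniform point inside the eps-ball around x.
    x_adv = np.clip(x + rng.uniform(-eps, eps, size=x.shape), 0.0, 1.0)
    for _ in range(steps):
        # Analytic gradient of the logistic loss w.r.t. the input.
        margin = y * (x_adv @ w + b)
        grad = -y * w / (1.0 + np.exp(margin))
        # Ascent step in the sign direction, then project back.
        x_adv = x_adv + alpha * np.sign(grad)
        x_adv = np.clip(x_adv, x - eps, x + eps)  # project into eps-ball
        x_adv = np.clip(x_adv, 0.0, 1.0)          # keep valid pixel range
    return x_adv
```

Since α × 10 steps = 0.07 exceeds ϵ = 0.03, the iterate typically saturates at the ℓ∞ boundary, which is why the projection step matters.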