On Intriguing Layer-Wise Properties of Robust Overfitting in Adversarial Training
Authors: Duke Nguyen, Chaojian Yu, Vinoth Nandakumar, Young Choon Lee, Tongliang Liu
TMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments show that the proposed RAT prototype effectively eliminates robust overfitting. The contributions of this work are summarized as follows: ... We design two different realizations of RAT, with extensive experiments on a number of standard benchmarks, verifying its effectiveness. |
| Researcher Affiliation | Academia | Duke Nguyen EMAIL Sydney AI Center The University of Sydney; Chaojian Yu EMAIL Sydney AI Center The University of Sydney; Vinoth Nandakumar EMAIL Sydney AI Center The University of Sydney; Young Choon Lee EMAIL School of Computing Macquarie University; Tongliang Liu EMAIL Sydney AI Center The University of Sydney |
| Pseudocode | Yes | Algorithm 1 RAT-prototype (in a mini-batch). Require: base adversarial training algorithm A, optimizer O, network fw, model parameter w = {w1, w2, ..., wn}, training data D = {(xi, yi)}, mini-batch B, front and latter layer conditions Cfront and Clatter for fw, gradient adjustment strategy S. |
| Open Source Code | No | No explicit statement about providing open-source code for the methodology or a link to a code repository is found in the paper. |
| Open Datasets | Yes | CIFAR-10 (Krizhevsky et al., 2009). The CIFAR-10 dataset (Canadian Institute for Advanced Research, 10 classes) is a subset of the Tiny Images dataset and consists of 60000 32x32 color images... CIFAR-100 (Krizhevsky et al., 2009). The CIFAR-100 dataset ... SVHN (Netzer et al., 2011). Street View House Numbers (SVHN) is a digit classification benchmark dataset... |
| Dataset Splits | Yes | The CIFAR-10 dataset ... There are 6000 images per class with 5000 training and 1000 testing images per class. ... The CIFAR-100 dataset ... There are 500 training images and 100 testing images per class. ... SVHN ... has three sets: training, testing sets and an extra set with 530,000 images that are less difficult and can be used for helping with the training process. |
| Hardware Specification | No | No specific hardware details (like GPU models, CPU types, or memory) used for running experiments are mentioned in the paper. |
| Software Dependencies | No | The paper mentions using SGD with momentum and weight decay but does not specify versions of any software libraries, frameworks (e.g., PyTorch, TensorFlow), or programming languages (e.g., Python). |
| Experiment Setup | Yes | We use PreActResNet-18 (He et al., 2016) and WideResNet-34-10 (Zagoruyko & Komodakis, 2016) following the same hyperparameter settings for AT in Rice et al. (2020): for L∞ threat model, ϵ = 8/255, step size is 1/255 for SVHN, and 2/255 for CIFAR-10 and CIFAR-100; for L2 threat model, ϵ = 128/255, step size is 15/255 for all datasets. For training, all models are trained under 10-step PGD (PGD-10) attack for 200 epochs using SGD with momentum 0.9, weight decay 5 × 10−4, and a piecewise learning rate schedule with an initial learning rate of 0.1. Standard data augmentation techniques, including random cropping with 4 pixels of padding and random horizontal flips, are applied. |
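For reference, the quoted training setup (L∞ PGD-10 with ϵ = 8/255, step size 2/255, and a piecewise learning rate schedule starting at 0.1) can be sketched in plain numpy. The milestone epochs for the LR drops are an assumption (the paper only states the schedule is piecewise with initial LR 0.1; Rice et al. (2020) commonly decay by 10× at epochs 100 and 150), and `grad_fn` is a hypothetical stand-in for the gradient of the loss with respect to the input.

```python
import numpy as np

def piecewise_lr(epoch, initial_lr=0.1, milestones=(100, 150), gamma=0.1):
    """Piecewise-constant LR schedule. Milestones/gamma are assumptions,
    not stated in the paper; only the initial LR of 0.1 is quoted."""
    lr = initial_lr
    for m in milestones:
        if epoch >= m:
            lr *= gamma
    return lr

def pgd_linf(x, grad_fn, eps=8/255, step=2/255, steps=10, rng=None):
    """L-inf PGD as in the quoted setup: random start inside the
    eps-ball, signed-gradient ascent steps, projection back into the
    ball and the valid [0, 1] pixel range. `grad_fn(x_adv)` must return
    the gradient of the adversarial loss w.r.t. the input (hypothetical
    helper; any differentiable model would supply it)."""
    rng = np.random.default_rng() if rng is None else rng
    x_adv = np.clip(x + rng.uniform(-eps, eps, size=x.shape), 0.0, 1.0)
    for _ in range(steps):
        x_adv = x_adv + step * np.sign(grad_fn(x_adv))
        x_adv = np.clip(x_adv, x - eps, x + eps)   # project into eps-ball
        x_adv = np.clip(x_adv, 0.0, 1.0)           # stay a valid image
    return x_adv
```

With a toy constant-gradient `grad_fn`, the returned perturbation never exceeds ϵ in L∞ norm, matching the threat model quoted above.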