Mitigating Feature Gap for Adversarial Robustness by Feature Disentanglement

Authors: Nuoyan Zhou, Dawei Zhou, Decheng Liu, Nannan Wang, Xinbo Gao

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Empirical evaluations on three benchmark datasets demonstrate that our approach surpasses existing adversarial fine-tuning methods and adversarial training baselines." "Experiments: This section presents experiments with AFD. We first introduce our experiment setting. We then make comparisons across multiple architectures and datasets to show the effectiveness of our method. Next, we conduct an ablation study to show the roles of the disentangling module and the alignment. Finally, we conduct an empirical analysis to support our hypothesis."
Researcher Affiliation | Academia | 1 State Key Laboratory of Integrated Services Networks, Xidian University, Xi'an, China; 2 Chongqing Key Laboratory of Image Cognition, Chongqing University of Posts and Telecommunications, Chongqing, China
Pseudocode | Yes | Algorithm 1: Adversarial Fine-tuning via Disentanglement
Open Source Code | No | The paper does not contain any explicit statement or link regarding the public release of its source code.
Open Datasets | Yes | We conduct experiments on three benchmark datasets: CIFAR-10, CIFAR-100 (Krizhevsky 2009), and Tiny-ImageNet (Deng et al. 2009).
Dataset Splits | Yes | The CIFAR-10 dataset contains 60,000 32×32 color images in 10 classes, with 50,000 training and 10,000 test images. The CIFAR-100 dataset contains 50,000 training and 10,000 test images in 100 classes. The Tiny-ImageNet dataset contains 100,000 images of 200 classes (500 per class) downsized to 64×64 color images; each class has 500 training images, 50 validation images, and 50 test images. We use only the training and validation images of Tiny-ImageNet.
Hardware Specification | No | The paper mentions using ResNet-18 and WideResNet-34-10 architectures with their parameter counts, but does not specify any particular GPU models, CPU models, or other hardware used to run the experiments.
Software Dependencies | No | The paper mentions using a Stochastic Gradient Descent (SGD) optimizer and an Exponential Moving Average (EMA), but does not provide specific software names with version numbers (e.g., Python, PyTorch, or TensorFlow versions).
Experiment Setup | Yes | The optimizer is Stochastic Gradient Descent (SGD) with a momentum of 0.9. We fine-tune the pre-trained models with a batch size of 128, a learning rate of 0.0025, a weight decay of 5.0×10⁻⁴, and 20 training epochs... For the hyperparameters {α, β, γ}, we suggest {0.05, 0.25, 25} in most cases... The maximum perturbation is set to 8/255. L∞-norm PGD with a random start, a step size of 0.003, and 20 attack iterations is used in the evaluation.
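The evaluation attack described above (L∞ PGD with random start, ε = 8/255, step size 0.003, 20 iterations) can be sketched generically as follows. This is a minimal illustration of the standard PGD procedure, not the paper's code: the names `pgd_linf` and `grad_fn`, and the toy linear model, are assumptions for the example; any real evaluation would pass the gradient of the model's loss with respect to the input.

```python
import numpy as np

def pgd_linf(x, y, grad_fn, eps=8/255, step=0.003, iters=20, seed=0):
    """Generic L-inf PGD attack with a random start.

    grad_fn(x_adv, y) must return the gradient of the loss w.r.t. x_adv.
    Inputs are assumed to live in [0, 1] (pixel range).
    """
    rng = np.random.default_rng(seed)
    # Random start: uniform noise inside the L-inf ball of radius eps.
    x_adv = np.clip(x + rng.uniform(-eps, eps, size=x.shape), 0.0, 1.0)
    for _ in range(iters):
        g = grad_fn(x_adv, y)
        x_adv = x_adv + step * np.sign(g)         # gradient-ascent step
        x_adv = np.clip(x_adv, x - eps, x + eps)  # project back into the ball
        x_adv = np.clip(x_adv, 0.0, 1.0)          # keep a valid pixel range
    return x_adv

# Toy example (illustrative only): binary-cross-entropy loss on a
# linear model, with the input gradient computed analytically.
w = np.array([1.0, -2.0, 0.5])

def toy_grad(x_adv, y):
    p = 1.0 / (1.0 + np.exp(-(x_adv @ w)))  # sigmoid prediction
    return (p - y) * w                      # d(BCE loss)/d(input)

x = np.array([0.2, 0.7, 0.4])
x_adv = pgd_linf(x, 1.0, toy_grad)
```

By construction, the returned `x_adv` differs from `x` by at most ε in every coordinate and stays in [0, 1], which is what the two projection steps enforce.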