Mitigating Feature Gap for Adversarial Robustness by Feature Disentanglement

Authors: Nuoyan Zhou, Dawei Zhou, Decheng Liu, Nannan Wang, Xinbo Gao

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Empirical evaluations on three benchmark datasets demonstrate that our approach surpasses existing adversarial fine-tuning methods and adversarial training baselines." "Experiments: This section presents experiments with AFD. We first introduce our experiment setting. We then make comparisons across multiple architectures and datasets to show the effectiveness of our method. Next, we conduct an ablation study to show the roles of the disentangling module and the alignment. Finally, we conduct an empirical analysis to support our hypothesis."
Researcher Affiliation | Academia | 1 State Key Laboratory of Integrated Services Networks, Xidian University, Xi'an, China; 2 Chongqing Key Laboratory of Image Cognition, Chongqing University of Posts and Telecommunications, Chongqing, China
Pseudocode | Yes | Algorithm 1: Adversarial Fine-tuning via Disentanglement
Open Source Code | No | The paper does not contain any explicit statement or link regarding the public release of its source code.
Open Datasets | Yes | We conduct experiments on three benchmark datasets: CIFAR-10, CIFAR-100 (Krizhevsky 2009), and Tiny-ImageNet (Deng et al. 2009).
Dataset Splits | Yes | The CIFAR-10 dataset contains 60,000 32×32 color images in 10 classes, with 50,000 training and 10,000 test images. The CIFAR-100 dataset contains 50,000 training and 10,000 test images in 100 classes. The Tiny-ImageNet dataset contains 100,000 images of 200 classes (500 per class) downsized to 64×64 color images; each class has 500 training images, 50 validation images, and 50 test images. We use only the training and validation images of Tiny-ImageNet.
Hardware Specification | No | The paper mentions using ResNet-18 and WideResNet-34-10 architectures with their parameter counts, but does not specify any particular GPU models, CPU models, or other hardware used to run the experiments.
Software Dependencies | No | The paper mentions using a Stochastic Gradient Descent (SGD) optimizer and an Exponential Moving Average (EMA), but does not provide specific software names with version numbers (e.g., Python, PyTorch, or TensorFlow versions).
Experiment Setup | Yes | The optimizer is Stochastic Gradient Descent (SGD) with a momentum of 0.9. We fine-tune the pre-trained models with a batch size of 128, a learning rate of 0.0025, a weight decay of 5.0×10⁻⁴, and 20 training epochs... For the hyperparameters {α, β, γ}, we suggest {0.05, 0.25, 25} in most cases... The maximum perturbation is set to 8/255. L∞-norm PGD with a random start, a step size of 0.003, and 20 attack iterations is used in the evaluation.
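The evaluation attack described above (L∞ PGD with random start, ε = 8/255, step size 0.003, 20 iterations) can be sketched generically as follows. This is a minimal illustration of the standard PGD procedure, not the paper's code: the names `pgd_linf` and `grad_fn`, and the toy linear model, are assumptions for the example; any real evaluation would pass the gradient of the model's loss with respect to the input.

```python
import numpy as np

def pgd_linf(x, y, grad_fn, eps=8/255, step=0.003, iters=20, seed=0):
    """Generic L-inf PGD attack with a random start.

    grad_fn(x_adv, y) must return the gradient of the loss w.r.t. x_adv.
    Inputs are assumed to live in [0, 1] (pixel range).
    """
    rng = np.random.default_rng(seed)
    # Random start: uniform noise inside the L-inf ball of radius eps.
    x_adv = np.clip(x + rng.uniform(-eps, eps, size=x.shape), 0.0, 1.0)
    for _ in range(iters):
        g = grad_fn(x_adv, y)
        x_adv = x_adv + step * np.sign(g)         # gradient-ascent step
        x_adv = np.clip(x_adv, x - eps, x + eps)  # project back into the ball
        x_adv = np.clip(x_adv, 0.0, 1.0)          # keep a valid pixel range
    return x_adv

# Toy example (illustrative only): binary-cross-entropy loss on a
# linear model, with the input gradient computed analytically.
w = np.array([1.0, -2.0, 0.5])

def toy_grad(x_adv, y):
    p = 1.0 / (1.0 + np.exp(-(x_adv @ w)))  # sigmoid prediction
    return (p - y) * w                      # d(BCE loss)/d(input)

x = np.array([0.2, 0.7, 0.4])
x_adv = pgd_linf(x, 1.0, toy_grad)
```

By construction, the returned `x_adv` differs from `x` by at most ε in every coordinate and stays in [0, 1], which is what the two projection steps enforce.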