Long-tailed Adversarial Training with Self-Distillation

Authors: Seungju Cho, Hongsin Lee, Changick Kim

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our extensive experiments demonstrate state-of-the-art performance in both clean and robust accuracy for long-tailed adversarial robustness, with significant improvements in tail class performance on various datasets. We improve the accuracy against PGD attacks for tail classes by 20.3, 7.1, and 3.8 percentage points on CIFAR-10, CIFAR-100, and Tiny-ImageNet, respectively, while achieving the highest robust accuracy.
Researcher Affiliation | Academia | Seungju Cho, Hongsin Lee, Changick Kim; Korea Advanced Institute of Science and Technology (KAIST)
Pseudocode | Yes | Algorithm 1: Main Algorithm
Open Source Code | No | The paper does not contain any explicit statement or link indicating that the source code for the described methodology is publicly available.
Open Datasets | Yes | Dataset. We conducted experiments using long-tailed distribution datasets: CIFAR-10-LT, CIFAR-100-LT (Krizhevsky et al., 2009), and Tiny-ImageNet-LT (Le & Yang, 2015), with various imbalance ratios (IR), primarily set at 50 for CIFAR-10-LT and 10 for CIFAR-100-LT and Tiny-ImageNet-LT.
Dataset Splits | No | The paper uses long-tailed versions of CIFAR-10, CIFAR-100, and Tiny-ImageNet and reports "Test Accuracy" and "Tail clean accuracy", implying a held-out test set, but it does not state explicit train/validation/test split percentages or sample counts, nor does it reference standard splits with citations or file names.
Hardware Specification | No | The paper does not give any specific details about the hardware used to run the experiments, such as GPU models, CPU types, or cloud computing specifications.
Software Dependencies | No | The paper names model architectures (ResNet-18, WideResNet-34-10, PreActResNet-18) and adversarial training methods (PGD-AT, TRADES, MART, AWP), but it does not specify any software libraries or frameworks with version numbers (e.g., PyTorch 1.x, TensorFlow 2.x, Python 3.x).
Experiment Setup | Yes | Training details. We employed ResNet-18 (He et al., 2016a) and WideResNet-34-10 (Zagoruyko & Komodakis, 2016) architectures for CIFAR-10/100-LT; results for WideResNet-34-10 are included in the appendix. For Tiny-ImageNet-LT, we employed PreActResNet-18 (He et al., 2016b). Initially, we trained a balanced self-teacher with the same model architecture for 30 epochs, using a batch size of 32 on a balanced dataset resampled from the original long-tailed dataset with γ = IR/2. In the main training phase, we trained for 100 epochs with a batch size of 128, using self-distillation from the balanced self-teacher. We used SGD to train both the balanced self-teacher and the main model, setting the learning rate to 0.1 and weight decay to 5 × 10⁻⁴. We used an epsilon boundary of 8/255, a commonly used setting in adversarial training, and employed a 10-step PGD attack during training.
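
The Open Datasets row mentions imbalance ratios (IR) without the paper's exact subsampling rule being quoted. A common construction for CIFAR-10-LT-style benchmarks decays per-class sample counts exponentially from the head class to the tail class; the sketch below assumes that profile, and the helper name `long_tailed_counts` is illustrative, not from the paper.

```python
def long_tailed_counts(n_max: int, num_classes: int, imbalance_ratio: float) -> list[int]:
    """Per-class sample counts under an exponential long-tail profile.

    Class 0 keeps n_max samples; class (num_classes - 1) keeps
    n_max / imbalance_ratio samples; intermediate classes decay
    geometrically. This is a common LT-benchmark convention, assumed
    here since the paper excerpt does not spell out its rule.
    """
    return [
        int(n_max * imbalance_ratio ** (-k / (num_classes - 1)))
        for k in range(num_classes)
    ]

# Example: CIFAR-10-LT with IR = 50 (5000 samples in the head class).
counts = long_tailed_counts(5000, 10, 50)
```

With IR = 50 the head class keeps all 5000 training images while the tail class keeps only 100, which is why tail-class robustness (the paper's focus) is hard to obtain.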
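
The Experiment Setup row quotes an ε = 8/255 bound with a 10-step PGD attack. As a hedged illustration of that attack loop (not the paper's code), here is a toy sketch on a 1-D logistic model: each step moves the input in the sign of the loss gradient, then projects back into the L∞ ball of radius ε around the clean input. The step size `alpha = 2/255` is a common default that the excerpt does not state, and `pgd_attack` is a hypothetical helper name.

```python
import math

def pgd_attack(x, y, w, b, eps=8/255, alpha=2/255, steps=10):
    """10-step PGD sketch on a toy 1-D logistic model p = sigmoid(w*x + b).

    For binary cross-entropy loss, the gradient w.r.t. the input is
    (p - y) * w, so each step adds alpha * sign(grad), then projects
    into [x - eps, x + eps] and the valid pixel range [0, 1].
    """
    x_adv = x
    for _ in range(steps):
        p = 1.0 / (1.0 + math.exp(-(w * x_adv + b)))
        grad = (p - y) * w                      # dLoss/dx for BCE
        x_adv += alpha if grad > 0 else -alpha  # signed gradient step
        x_adv = min(max(x_adv, x - eps), x + eps)  # L-infinity projection
        x_adv = min(max(x_adv, 0.0), 1.0)          # keep valid pixel range
    return x_adv
```

Since 10 × (2/255) exceeds 8/255, the projection is what actually binds: the adversarial input ends up on the boundary of the ε-ball, e.g. `pgd_attack(0.5, 0, 1.0, 0.0)` returns 0.5 + 8/255.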