Algorithmic Stability Based Generalization Bounds for Adversarial Training

Authors: Runzhi Tian, Yongyi Mao

ICLR 2025

Each reproducibility variable is listed below with its result and a supporting excerpt from the paper.
Research Type: Experimental — "Our additional experiments (e.g., Figure 1) suggest that this is quite common. In Figure 1, we perform AT with a 3-step PGD and measure the error of the model against a 3-step PGD attack, as well as its standard error, during training. ... We conduct experiments for PGD-AT when G is chosen as tanh_γ as well as the identity map."
Researcher Affiliation: Academia — Runzhi Tian, University of Ottawa; Yongyi Mao, University of Ottawa.
Pseudocode: No — The paper describes the AT algorithm iteratively with equations (7) and (8) but does not present it in a structured pseudocode or algorithm block.
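Since the paper gives the AT iteration only in equation form, a minimal Python sketch of a generic PGD-AT step may help; this is not the authors' code, the function names and the abstract `grad_fn`/`grad_x`/`grad_w` interfaces are illustrative, and it assumes the standard sign-gradient inner maximization.

```python
import numpy as np

def pgd_perturb(grad_fn, x, eps, step, k, rng):
    """Inner maximization: K-step l_inf PGD ascent from a random start.

    grad_fn(z) returns the gradient of the loss w.r.t. the input z.
    """
    delta = rng.uniform(-eps, eps, size=x.shape)            # random start
    for _ in range(k):
        delta = delta + step * np.sign(grad_fn(x + delta))  # signed ascent step
        delta = np.clip(delta, -eps, eps)                   # project onto the eps-ball
    return x + delta

def adversarial_training_step(w, x, y, grad_x, grad_w, lr, eps, step, k, rng):
    """Outer minimization: one SGD step on the loss at the adversarial point."""
    x_adv = pgd_perturb(lambda z: grad_x(w, z, y), x, eps, step, k, rng)
    return w - lr * grad_w(w, x_adv, y)
```

Alternating these two steps over mini-batches is the usual min-max training loop that equations of this form describe.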
Open Source Code: Yes — "Code is available at https://github.com/rzTian/AT-Stability."
Open Datasets: Yes — "Specifically, Rice et al. (2020) shows that on the CIFAR-10 dataset (Krizhevsky et al., 2009)... The experiments are conducted on CIFAR-10, CIFAR-100 (Krizhevsky et al., 2009) and SVHN (Netzer et al., 2011)."
Dataset Splits: No — The paper mentions evaluating on 'training and testing sets' (e.g., in the Figure 1 caption and Section 5) but does not specify exact split percentages, sample counts, or the methodology used to create the splits.
Hardware Specification: Yes — "Training PRN-18 on CIFAR-10 and SVHN for 200 epochs takes around 18 hours with two NVIDIA V100 GPUs, and training WRN-34 on CIFAR-100 requires around three days to complete with the same computing resources."
Software Dependencies: No — The paper mentions model architectures such as pre-activation ResNet-18 (PRN-18) and WideResNet-34 (WRN-34) but does not specify any software libraries or frameworks with version numbers (e.g., Python, PyTorch, or TensorFlow).
Experiment Setup: Yes — "In our experiments, we follow the settings in Rice et al. (2020): The perturbation radius is set to ϵ = 8/255 w.r.t. the ℓ∞ norm for the three datasets. ... We set K = 10 for all the PGD variants, with λ = 2/255 on CIFAR-10 and CIFAR-100 and λ = 1/255 on SVHN. The initial learning rate of AT is set to 0.1 for CIFAR-10 and CIFAR-100 and 0.01 for SVHN. The learning rate is decayed by 0.1 at the 100th and 150th epochs of training. The batch size is set to 128, and a weight decay of 5 × 10⁻⁴ is used for all the experiments."
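For reference, the reported hyperparameters can be collected into a small configuration sketch; the dictionary keys and function name below are illustrative choices, not identifiers from the paper's code, and only the numeric values come from the quoted setup.

```python
def learning_rate(epoch, base_lr=0.1):
    """Piecewise-constant schedule: multiply the rate by 0.1 at epochs 100 and 150."""
    lr = base_lr
    if epoch >= 100:
        lr *= 0.1
    if epoch >= 150:
        lr *= 0.1
    return lr

# Reported PGD-AT settings (Rice et al., 2020 configuration).
AT_CONFIG = {
    "epsilon": 8 / 255,        # l_inf perturbation radius, all three datasets
    "pgd_steps": 10,           # K, for all PGD variants
    "pgd_step_size": {"cifar10": 2 / 255, "cifar100": 2 / 255, "svhn": 1 / 255},
    "base_lr": {"cifar10": 0.1, "cifar100": 0.1, "svhn": 0.01},
    "batch_size": 128,
    "weight_decay": 5e-4,
    "epochs": 200,
}
```

A training loop would call `learning_rate(epoch, AT_CONFIG["base_lr"][dataset])` at the start of each epoch to reproduce the stated decay at epochs 100 and 150.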