Training Deep Neural Networks with Virtual Smoothing Classes

Authors: Zhiyang Zhou, Siwei Wei, Xudong Zhang, Wensheng Dou, Muzi Qu, Yan Cai

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental (4 Experiments) | "We evaluate the impact of VS labels on model accuracy, calibration, and knowledge distillation (Sec 4.1), followed by assessing their effects on robustness and robust distillation in adversarial settings (Sec 4.2), as well as their effects on out-of-distribution detection (Sec 4.3). ... Tab 2 shows the test accuracy (%). ... Tab 4 shows the results, where the ECE corresponding to the α yielding the highest accuracy is enclosed in brackets []. ... Tab 5 shows the results of KD... Tab 6 and Tab 7 show the test robustness... Tab 11 shows the results, where the number of reject classes in SSL is K. ... 4.4 Ablation Studies: In this section, we study how different confidences and numbers of VS classes affect accuracy..."
Researcher Affiliation | Academia | (1) Key Laboratory of System Software (Chinese Academy of Sciences) and State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences, China; (2) University of Chinese Academy of Sciences, Beijing, China; (3) Nanjing Institute of Software Technology, University of Chinese Academy of Sciences, Nanjing, China
Pseudocode | No | The paper describes the proposed approach and experimental settings in text and mathematical formulas, but does not include any clearly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | Code: https://github.com/zhiyang3344/virtual-smoothing
Open Datasets | Yes | Experimental Settings: "On SVHN (Netzer et al. 2011), CIFAR10, and CIFAR100 (Krizhevsky, Hinton et al. 2009), we train ResNet-18 (He et al. 2016) and ResNeXt-29 (2x64d) (Xie et al. 2017) for 200 epochs... On Tiny-ImageNet-200, we select ResNet-18 and ResNeXt-50 (32x4d) and use the same settings for training. On ImageNet (Russakovsky et al. 2015), we train ResNet-18 and ResNeXt-50 (32x4d) for 120 epochs..."
Dataset Splits | No | The paper uses standard datasets (SVHN, CIFAR10, CIFAR100, Tiny-ImageNet-200, and ImageNet) and reports "test accuracy", implying the standard train/test splits. However, it does not explicitly state split percentages, sample counts, or any split methodology in the text.
Hardware Specification | No | The paper does not explicitly describe the specific hardware (e.g., GPU models, CPU types, memory specifications) used for running the experiments.
Software Dependencies | No | The paper mentions using optimizers such as SGD and AdamW but does not provide specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions).
Experiment Setup | Yes | "On SVHN (Netzer et al. 2011), CIFAR10, and CIFAR100 (Krizhevsky, Hinton et al. 2009), we train ResNet-18 (He et al. 2016) and ResNeXt-29 (2x64d) (Xie et al. 2017) for 200 epochs using the SGD optimizer with momentum 0.9, weight decay 0.0001, batch size 128 and an initial Learning Rate (LR) 0.1 divided by 10 at the 100-th and 150-th epochs. ... On ImageNet (Russakovsky et al. 2015), we train ResNet-18 and ResNeXt-50 (32x4d) for 120 epochs with similar settings but set batch size to 256 and divide the LR by 10 at the 60-th, 90-th and 110-th epochs. ... We use the AdamW scheduler with an initial LR of 0.001, batch sizes 256 (512), weight decay 0.05 (0.065) for T2T-ViT-14 (T2T-ViT-24). ... AT and TRADES are trained for 160 epochs using SGD with momentum 0.9, weight decay 5e-4, batch size 128, and an initial LR 0.1 divided by 10 at the 150th and 155th epochs. ... The training attack is PGD-10 with a step-size of 0.00784 (≈ 2/255)."
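Two details quoted above can be made concrete in a short sketch. The `vs_target` function is a hypothetical reading of the virtual-smoothing targets, assuming the true class keeps mass 1 − α and the α mass is spread uniformly over the appended virtual classes; the paper's exact allocation may differ, and the function name and uniform split are illustrative assumptions. The `step_lr` function mirrors the quoted stepwise decay (initial LR 0.1, divided by 10 at epochs 100 and 150).

```python
def vs_target(num_real, num_virtual, true_class, alpha):
    """Hypothetical VS-style target: true class keeps 1 - alpha,
    and alpha is spread uniformly over the virtual classes appended
    after the real ones. (Sketch only; the paper's allocation may differ.)"""
    target = [0.0] * (num_real + num_virtual)
    target[true_class] = 1.0 - alpha
    for k in range(num_real, num_real + num_virtual):
        target[k] = alpha / num_virtual
    return target


def step_lr(epoch, base_lr=0.1, milestones=(100, 150), gamma=0.1):
    """Piecewise-constant schedule quoted in the setup:
    LR is multiplied by gamma (here 0.1) at each milestone epoch."""
    lr = base_lr
    for m in milestones:
        if epoch >= m:
            lr *= gamma
    return lr


# The quoted PGD-10 step size 0.00784 is 2/255 in [0, 1] pixel scale.
PGD_STEP = 2 / 255
```

For the ImageNet recipe quoted above, the same `step_lr` would be called with `milestones=(60, 90, 110)`; the construction is identical, only the schedule changes.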