Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty, so scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Rethinking Softmax Cross-Entropy Loss for Adversarial Robustness
Authors: Tianyu Pang, Kun Xu, Yinpeng Dong, Chao Du, Ning Chen, Jun Zhu
ICLR 2020 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we empirically demonstrate several attractive merits of applying the MMC loss. We experiment on the widely used MNIST, CIFAR-10, and CIFAR-100 datasets (Krizhevsky & Hinton, 2009; LeCun et al., 1998). |
| Researcher Affiliation | Collaboration | Tianyu Pang, Kun Xu, Yinpeng Dong, Chao Du, Ning Chen, Jun Zhu Dept. of Comp. Sci. & Tech., BNRist Center, Institute for AI, Tsinghua University; RealAI EMAIL, EMAIL |
| Pseudocode | Yes | We give the generation algorithm for crafting the Max-Mahalanobis Centers in Algorithm 1, proposed by Pang et al. (2018). |
| Open Source Code | Yes | The codes are provided in https://github.com/P2333/Max-Mahalanobis-Training. |
| Open Datasets | Yes | We experiment on the widely used MNIST, CIFAR-10, and CIFAR-100 datasets (Krizhevsky & Hinton, 2009; LeCun et al., 1998). |
| Dataset Splits | No | The paper uses standard datasets (MNIST, CIFAR-10, CIFAR-100) and mentions training epochs, but does not explicitly provide the train/validation/test dataset splits (e.g., percentages, sample counts, or citations to specific split methodologies) used to reproduce the experiments. |
| Hardware Specification | Yes | Most of our experiments are conducted on the NVIDIA DGX-1 server with eight Tesla P100 GPUs. |
| Software Dependencies | No | The paper mentions the use of the momentum SGD optimizer but does not specify software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow, or specific library versions). |
| Experiment Setup | Yes | For each training loss with or without the AT mechanism, we apply the momentum SGD (Qian, 1999) optimizer with the initial learning rate of 0.01, and train for 40 epochs on MNIST, 200 epochs on CIFAR-10 and CIFAR-100. The learning rate decays with a factor of 0.1 at 100 and 150 epochs, respectively. When applying the AT mechanism (Madry et al., 2018), the adversarial examples for training are crafted by 10-step targeted or untargeted PGD with ϵ = 8/255. ... we choose the perturbation ϵ = 8/255 and 16/255, with the step size of 2/255. |
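The pseudocode row cites Algorithm 1, the Max-Mahalanobis center construction of Pang et al. (2018): L class centers of equal norm with pairwise inner products all equal to −1/(L−1) times the squared norm (a regular simplex). The sketch below is an illustrative reconstruction of one way to realize that construction, not the authors' code; the function name `mm_centers` and the squared-norm parameter `c_mm` are placeholders chosen here.

```python
import math

def mm_centers(num_classes, dim, c_mm=10.0):
    """Illustrative reconstruction of Max-Mahalanobis center generation
    (Pang et al., 2018): num_classes vectors in R^dim with squared norm
    c_mm and pairwise inner product -c_mm / (num_classes - 1).
    Requires dim >= num_classes; names and defaults are assumptions."""
    assert dim >= num_classes
    mu = [[0.0] * dim for _ in range(num_classes)]
    mu[0][0] = 1.0  # first unit center along the first axis
    for i in range(1, num_classes):
        # Solve each earlier inner-product constraint coordinate by coordinate.
        for j in range(i):
            dot = sum(mu[i][k] * mu[j][k] for k in range(j))
            mu[i][j] = (-1.0 / (num_classes - 1) - dot) / mu[j][j]
        # Fill the remaining degree of freedom so the center has unit norm.
        norm_sq = sum(v * v for v in mu[i][:i])
        mu[i][i] = math.sqrt(max(0.0, 1.0 - norm_sq))
    # Scale the unit simplex so every center has squared norm c_mm.
    scale = math.sqrt(c_mm)
    return [[scale * v for v in row] for row in mu]
```

For example, `mm_centers(10, 64)` yields ten 64-dimensional centers whose pairwise inner products are all −c_mm/9, the maximally separated configuration the MMC loss is built around.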
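The setup row fixes the optimization schedule (initial learning rate 0.01, decay by 0.1 at epochs 100 and 150) and the PGD crafting recipe (10 steps, ϵ = 8/255, step size 2/255). Since the paper names no framework (a gap the Software Dependencies row flags), the following framework-free sketch just illustrates those stated hyperparameters; `grad_fn` is a hypothetical stand-in for the input-gradient computation a deep-learning library would supply, and inputs are assumed to be pixels in [0, 1].

```python
def lr_schedule(epoch, base_lr=0.01, milestones=(100, 150), gamma=0.1):
    """Step decay matching the quoted CIFAR schedule:
    0.01, then x0.1 at epoch 100 and again at epoch 150."""
    return base_lr * gamma ** sum(epoch >= m for m in milestones)

def pgd_attack(x, grad_fn, eps=8 / 255, step=2 / 255, steps=10):
    """Untargeted L-infinity PGD sketch (Madry et al., 2018) on a flat
    list of pixel values. grad_fn(x_adv) is an assumed callable returning
    the loss gradient w.r.t. the input."""
    x_adv = list(x)
    for _ in range(steps):
        g = grad_fn(x_adv)
        # Ascend the loss along the gradient sign.
        x_adv = [v + step * (1 if gi > 0 else -1 if gi < 0 else 0)
                 for v, gi in zip(x_adv, g)]
        # Project back into the eps-ball around the clean input.
        x_adv = [min(max(v, xo - eps), xo + eps) for v, xo in zip(x_adv, x)]
        # Keep pixels in the valid [0, 1] range.
        x_adv = [min(max(v, 0.0), 1.0) for v in x_adv]
    return x_adv
```

With step = 2/255 over 10 iterations, the total movement (20/255) exceeds ϵ = 8/255, so the per-step projection is what keeps the perturbation inside the stated budget.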