Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty, so scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Rethinking Softmax Cross-Entropy Loss for Adversarial Robustness
Authors: Tianyu Pang, Kun Xu, Yinpeng Dong, Chao Du, Ning Chen, Jun Zhu
ICLR 2020 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we empirically demonstrate several attractive merits of applying the MMC loss. We experiment on the widely used MNIST, CIFAR-10, and CIFAR-100 datasets (Krizhevsky & Hinton, 2009; LeCun et al., 1998). |
| Researcher Affiliation | Collaboration | Tianyu Pang, Kun Xu, Yinpeng Dong, Chao Du, Ning Chen, Jun Zhu Dept. of Comp. Sci. & Tech., BNRist Center, Institute for AI, Tsinghua University; RealAI EMAIL, EMAIL |
| Pseudocode | Yes | We give the generation algorithm for crafting the Max-Mahalanobis Centers in Algorithm 1, proposed by Pang et al. (2018). |
| Open Source Code | Yes | The codes are provided in https://github.com/P2333/Max-Mahalanobis-Training. |
| Open Datasets | Yes | We experiment on the widely used MNIST, CIFAR-10, and CIFAR-100 datasets (Krizhevsky & Hinton, 2009; LeCun et al., 1998). |
| Dataset Splits | No | The paper uses standard datasets (MNIST, CIFAR-10, CIFAR-100) and mentions training epochs, but does not explicitly provide the train/validation/test dataset splits (e.g., percentages, sample counts, or citations to specific split methodologies) used to reproduce the experiments. |
| Hardware Specification | Yes | Most of our experiments are conducted on the NVIDIA DGX-1 server with eight Tesla P100 GPUs. |
| Software Dependencies | No | The paper mentions the use of the momentum SGD optimizer but does not specify software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow, or specific library versions). |
| Experiment Setup | Yes | For each training loss with or without the AT mechanism, we apply the momentum SGD (Qian, 1999) optimizer with the initial learning rate of 0.01, and train for 40 epochs on MNIST, 200 epochs on CIFAR-10 and CIFAR-100. The learning rate decays with a factor of 0.1 at 100 and 150 epochs, respectively. When applying the AT mechanism (Madry et al., 2018), the adversarial examples for training are crafted by 10-step targeted or untargeted PGD with ϵ = 8/255. ... we choose the perturbation ϵ = 8/255 and 16/255, with the step size of 2/255. |
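The pseudocode row cites Algorithm 1, the Max-Mahalanobis center construction of Pang et al. (2018): L class centers of equal norm with pairwise inner products all equal to −1/(L−1) times the squared norm (a regular simplex). The sketch below is an illustrative reconstruction of one way to realize that construction, not the authors' code; the function name `mm_centers` and the squared-norm parameter `c_mm` are placeholders chosen here.

```python
import math

def mm_centers(num_classes, dim, c_mm=10.0):
    """Illustrative reconstruction of Max-Mahalanobis center generation
    (Pang et al., 2018): num_classes vectors in R^dim with squared norm
    c_mm and pairwise inner product -c_mm / (num_classes - 1).
    Requires dim >= num_classes; names and defaults are assumptions."""
    assert dim >= num_classes
    mu = [[0.0] * dim for _ in range(num_classes)]
    mu[0][0] = 1.0  # first unit center along the first axis
    for i in range(1, num_classes):
        # Solve each earlier inner-product constraint coordinate by coordinate.
        for j in range(i):
            dot = sum(mu[i][k] * mu[j][k] for k in range(j))
            mu[i][j] = (-1.0 / (num_classes - 1) - dot) / mu[j][j]
        # Fill the remaining degree of freedom so the center has unit norm.
        norm_sq = sum(v * v for v in mu[i][:i])
        mu[i][i] = math.sqrt(max(0.0, 1.0 - norm_sq))
    # Scale the unit simplex so every center has squared norm c_mm.
    scale = math.sqrt(c_mm)
    return [[scale * v for v in row] for row in mu]
```

For example, `mm_centers(10, 64)` yields ten 64-dimensional centers whose pairwise inner products are all −c_mm/9, the maximally separated configuration the MMC loss is built around.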
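The setup row fixes the optimization schedule (initial learning rate 0.01, decay by 0.1 at epochs 100 and 150) and the PGD crafting recipe (10 steps, ϵ = 8/255, step size 2/255). Since the paper names no framework (a gap the Software Dependencies row flags), the following framework-free sketch just illustrates those stated hyperparameters; `grad_fn` is a hypothetical stand-in for the input-gradient computation a deep-learning library would supply, and inputs are assumed to be pixels in [0, 1].

```python
def lr_schedule(epoch, base_lr=0.01, milestones=(100, 150), gamma=0.1):
    """Step decay matching the quoted CIFAR schedule:
    0.01, then x0.1 at epoch 100 and again at epoch 150."""
    return base_lr * gamma ** sum(epoch >= m for m in milestones)

def pgd_attack(x, grad_fn, eps=8 / 255, step=2 / 255, steps=10):
    """Untargeted L-infinity PGD sketch (Madry et al., 2018) on a flat
    list of pixel values. grad_fn(x_adv) is an assumed callable returning
    the loss gradient w.r.t. the input."""
    x_adv = list(x)
    for _ in range(steps):
        g = grad_fn(x_adv)
        # Ascend the loss along the gradient sign.
        x_adv = [v + step * (1 if gi > 0 else -1 if gi < 0 else 0)
                 for v, gi in zip(x_adv, g)]
        # Project back into the eps-ball around the clean input.
        x_adv = [min(max(v, xo - eps), xo + eps) for v, xo in zip(x_adv, x)]
        # Keep pixels in the valid [0, 1] range.
        x_adv = [min(max(v, 0.0), 1.0) for v in x_adv]
    return x_adv
```

With step = 2/255 over 10 iterations, the total movement (20/255) exceeds ϵ = 8/255, so the per-step projection is what keeps the perturbation inside the stated budget.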