reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Robust Models are less Over-Confident

Authors: Julia Grabinski, Paul Gavrikov, Janis Keuper, Margret Keuper

NeurIPS 2022 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	In this paper, we empirically analyze a variety of adversarially trained models that achieve high robust accuracies when facing state-of-the-art attacks and we show that AT has an interesting side-effect: it leads to models that are significantly less overconfident with their decisions, even on clean data than non-robust models. Our experiments for 71 robust and non-robust model pairs on the datasets CIFAR10 [43], CIFAR100 and ImageNet [19] confirm that non-robust models are overconfident with their false predictions.
Researcher Affiliation	Academia	Julia Grabinski (Fraunhofer ITWM, Kaiserslautern; Visual Computing, University of Siegen); Paul Gavrikov (IMLA, Offenburg University); Janis Keuper (Fraunhofer ITWM, Kaiserslautern; IMLA, Offenburg University); Margret Keuper (University of Siegen; Max Planck Institute for Informatics, Saarland Informatics Campus, Saarbrücken)
Pseudocode	No	The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code	Yes	Data & Project website: https://github.com/GeJulia/robustness_confidences_evaluation
Open Datasets	Yes	Our experiments for 71 robust and non-robust model pairs on the datasets CIFAR10 [43], CIFAR100 and ImageNet [19] confirm that non-robust models are overconfident with their false predictions.
Dataset Splits	Yes	CIFAR10 [43] is a simple ten class dataset consisting of 50,000 training and 10,000 validation images with a resolution of 32 × 32.
Hardware Specification	No	The paper does not provide specific details about the hardware used for running experiments.
Software Dependencies	No	The paper does not provide specific details about software dependencies, including version numbers for libraries or frameworks used.
Experiment Setup	Yes	Training details can be found in appendix A.