Calibration Attacks: A Comprehensive Study of Adversarial Attacks on Model Confidence
Authors: Stephen Obadinma, Xiaodan Zhu, Hongyu Guo
TMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We propose four typical forms of calibration attacks: underconfidence, overconfidence, maximum miscalibration, and random confidence attacks, conducted in both black-box and white-box setups. We demonstrate that the attacks are highly effective on both convolutional and attention-based models... We further investigate the effectiveness of a wide range of adversarial defence and recalibration methods... From the ECE and KS scores, we observe that there are still significant limitations... Section 4 Experiments |
| Researcher Affiliation | Academia | Stephen Obadinma EMAIL Department of Electrical and Computer Engineering & Ingenuity Labs Research Institute, Queen's University; Xiaodan Zhu EMAIL Department of Electrical and Computer Engineering & Ingenuity Labs Research Institute, Queen's University; Hongyu Guo EMAIL Digital Technologies Research Centre, National Research Council Canada |
| Pseudocode | Yes | Algorithm 1 A Brief Overview of Our Calibration Attack Framework |
| Open Source Code | Yes | Our code is available at https://github.com/PhenetOs/CalibrationAttack |
| Open Datasets | Yes | We performed a comprehensive study on CIFAR-100 (Krizhevsky & Hinton, 2009) and Caltech101 (Fei-Fei et al., 2004). We also included the German Traffic Sign Recognition Benchmark (GTSRB) (Houben et al., 2013) |
| Dataset Splits | Yes | For CIFAR-100 and GTSRB, we use the predefined training and test sets for both but use 10% of the training data for validation purposes. For Caltech-101, which comes without predetermined splits, we use an 80:10:10 train/validation/test split. |
| Hardware Specification | Yes | All of the training occurred on 24 GB Nvidia RTX-3090 and RTX Titan GPUs. |
| Software Dependencies | No | The paper names individual tools (e.g., "We use the Foolbox implementation of the PGD attack (Rauber et al., 2020; 2017)") but does not provide a full list of software dependencies with version numbers. |
| Experiment Setup | Yes | The hyperparameters we used for training the ResNet-50 models include: a batch size of 128, with a Cosine Annealing LR scheduler, 0.9 momentum, 5e-4 weight decay, and a stochastic gradient descent (SGD) optimizer. For ViT, the settings are the same, except we also use gradient clipping with the max norm set to 1.0. We conduct basic grid search hyperparameter tuning over a few values for the learning rate (0.1, 0.01, 0.005, 0.001) and training duration (in terms of epochs). Generally, we found that a learning rate of 0.01 worked best for both types of models. The training times vary for each dataset and model. For the ResNet-50 models we trained for 15 epochs on CIFAR-100, 10 epochs on Caltech-101, and 7 epochs on GTSRB. Likewise for ViT, we trained for 10 epochs on CIFAR-100, 15 epochs on Caltech-101, and 5 epochs on GTSRB. |
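To make the reported training schedule concrete, the sketch below reproduces the cosine annealing learning-rate curve for the ResNet-50 / CIFAR-100 setup (base learning rate 0.01, 15 epochs). This is a minimal stand-alone illustration, not the authors' code: the minimum learning rate `eta_min=0.0` and per-epoch stepping are assumptions, since the paper does not state them.

```python
import math

def cosine_annealing_lr(epoch, total_epochs, base_lr=0.01, eta_min=0.0):
    """Cosine-annealed learning rate for a given epoch.

    Follows the standard cosine annealing formula:
        eta_min + (base_lr - eta_min) * (1 + cos(pi * epoch / total_epochs)) / 2
    eta_min=0.0 is an assumption; the paper only reports the base rate.
    """
    return eta_min + (base_lr - eta_min) * (1 + math.cos(math.pi * epoch / total_epochs)) / 2

# Schedule for the reported ResNet-50 / CIFAR-100 run: lr 0.01 over 15 epochs.
schedule = [cosine_annealing_lr(t, 15) for t in range(16)]
print(schedule[0])    # starts at the base learning rate, 0.01
print(schedule[15])   # decays to eta_min (0.0 here) by the final step
```

The same curve applies to the other dataset/model pairs by swapping in their reported epoch counts; for ViT, the paper additionally clips gradients to a max norm of 1.0 during each update.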