AttackBench: Evaluating Gradient-based Attacks for Adversarial Examples

Authors: Antonio Emanuele Cinà, Jérôme Rony, Maura Pintor, Luca Demetrio, Ambra Demontis, Battista Biggio, Ismail Ben Ayed, Fabio Roli

AAAI 2025

Reproducibility Variable Result LLM Response
Research Type Experimental Our extensive experimental analysis compares more than 100 attack implementations over 800 different configurations, considering both CIFAR-10 and ImageNet models, and shows that only a few attack implementations outperform all the remaining approaches. These findings suggest that novel defenses should be evaluated against different attacks than those normally used in the literature to avoid overly-optimistic robustness evaluations.
Researcher Affiliation Academia 1 University of Genoa, Italy; 2 École de Technologie Supérieure, Montréal, Canada; 3 University of Cagliari, Italy. EMAIL, EMAIL, EMAIL, EMAIL, EMAIL, EMAIL, EMAIL, EMAIL
Pseudocode Yes Algorithm 1: Attack Benchmarking
Open Source Code Yes Code https://attackbench.github.io
Open Datasets Yes Dataset. We consider two popular datasets: CIFAR-10 (Krizhevsky 2009) and ImageNet (Deng et al. 2009).
Dataset Splits Yes We evaluate the performance of adversarial attacks on the entire CIFAR-10 test set, and on a random subset of 5,000 samples from the ImageNet validation set.
Hardware Specification Yes The execution time is measured on a shared compute cluster equipped with NVIDIA V100 SXM2 GPU (16GB memory).
Software Dependencies No The paper lists several adversarial attack libraries (Foolbox, CleverHans, adv_lib, ART, Torchattacks, DeepRobust) and cites papers for them, but does not provide specific version numbers for these libraries or for underlying software components like Python or PyTorch.
Experiment Setup Yes For each considered attack implementation, we employed the default hyperparameters. We set the maximum number of forward and backward propagations Q to 2,000. For an attack that does a single forward prediction and backward gradient computation per optimization step, this corresponds to the common 1,000-steps budget found in several works (Brendel, Rauber, and Bethge 2018; Rony et al. 2019; Pintor et al. 2021; Rony et al. 2021), sufficient for algorithm convergence (Pintor et al. 2022). Exceptionally, most implementations of the CW attack use a default number of steps equal to 10^4 across multiple runs to find c, so we modify it to perform the search of the penalty weight and the attack iterations within the 2,000 propagations. Furthermore, as done in (Rony et al. 2021; Cinà et al. 2024), for Fixed Budget attacks we leverage a line-search strategy to find the smallest perturbation budget ε for which the attack can successfully find an adversarial perturbation. Further details on the line search are reported in Appendix B.5. We use a batch size of 128 for CIFAR-10 and 32 for ImageNet.
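The line-search idea described above can be sketched as a simple bisection over the perturbation budget. This is a minimal illustrative sketch, not the paper's implementation (the actual procedure is in its Appendix B.5): `attack_succeeds`, `eps_max`, and the tolerance are hypothetical names and assumptions introduced here.

```python
def smallest_successful_budget(attack_succeeds, eps_max=1.0, steps=20, tol=1e-4):
    """Bisection search for the smallest budget eps at which a
    fixed-budget attack succeeds.

    attack_succeeds: callable eps -> bool, True if the attack finds an
    adversarial example within budget eps (assumed monotone in eps).
    Returns an upper bound on the smallest successful budget, or None
    if the attack fails even at eps_max.
    """
    lo, hi = 0.0, eps_max
    if not attack_succeeds(hi):
        return None  # no success even at the maximum budget
    for _ in range(steps):
        mid = (lo + hi) / 2.0
        if attack_succeeds(mid):
            hi = mid  # success: a smaller budget may still work
        else:
            lo = mid  # failure: the threshold lies above mid
        if hi - lo < tol:
            break
    return hi
```

For example, with a toy oracle that succeeds for any budget of at least 0.3, `smallest_successful_budget(lambda e: e >= 0.3)` converges to approximately 0.3. In practice each call to the oracle would run the full fixed-budget attack, so the number of bisection steps trades search precision against the overall propagation budget Q.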