Efficient Robustness Evaluation via Constraint Relaxation
Authors: Chao Pan, Yu Wu, Ke Tang, Qing Li, Xin Yao
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on 105 robust models show that CR Attack outperforms AutoAttack in both attack success rate and efficiency, reducing forward and backward propagation time by 38.3% and 15.9% respectively. Through comprehensive analysis, we validate that the constraint relaxation mechanism is crucial for the method's effectiveness. |
| Researcher Affiliation | Academia | 1Department of Computer Science and Engineering, Southern University of Science and Technology, Shenzhen 518055, China 2The Hong Kong Polytechnic University, Hong Kong, China 3School of Data Science, Lingnan University, Hong Kong, China EMAIL, EMAIL, EMAIL, EMAIL |
| Pseudocode | Yes | Algorithm 1: Constraint Relaxation Attack |
| Open Source Code | Yes | Code https://github.com/fzjcdt/constraint-relaxation-attack |
| Open Datasets | Yes | We conduct our experiments using three standard datasets from the RobustBench framework: CIFAR-10, CIFAR-100, and ImageNet. |
| Dataset Splits | Yes | For ImageNet, following the RobustBench protocol, we utilize the first 5000 images from the validation subset to ensure standardized comparison. The problem was posited as an adversarial attack under the ℓ∞ norm. To ensure appropriateness and impartiality of the analysis, we drew from all viable models available in RobustBench (Croce et al. 2020) without any bias towards specific selections. |
| Hardware Specification | Yes | For instance, evaluating the top two models on RobustBench's CIFAR-10 leaderboard (Krizhevsky, Hinton et al. 2009; Croce et al. 2020; Bartoldson et al. 2024; Amini et al. 2024) requires approximately 114.6 and 177.0 hours respectively on a single NVIDIA Tesla V100 GPU. |
| Software Dependencies | No | The paper does not explicitly state versions of specific software libraries or tools used for implementation, beyond mentioning frameworks like AutoAttack (which is compared against). |
| Experiment Setup | Yes | Following RobustBench guidelines, we set the perturbation size to 8/255 for CIFAR-10 and CIFAR-100, and 4/255 for ImageNet. We set the maximum number of iterations (T) to 150 and the decay steps (S) to 30, using margin loss as the loss function. The number of restarts was fixed at 5. To accelerate the process, from the second restart onwards, we only sampled instances with a margin loss less than 0.05 from the previous attempt. |
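The experiment-setup row mentions two concrete mechanisms: the margin loss used as the attack objective, and the restart-time filtering that resamples only instances whose previous margin loss fell below 0.05. A minimal sketch of both, assuming the standard definition of margin loss (true-class logit minus the largest competing logit); the function names and NumPy interface here are illustrative, not taken from the paper's released code:

```python
import numpy as np

def margin_loss(logits, label):
    """Margin loss for one example: true-class logit minus the
    largest logit of any other class. A negative value means the
    example is already misclassified (the attack has succeeded)."""
    other = np.delete(logits, label)
    return float(logits[label] - other.max())

def select_for_restart(margins, threshold=0.05):
    """Restart filtering as described in the setup: from the second
    restart onwards, keep only the indices of instances whose margin
    loss from the previous attempt is below the threshold."""
    return [i for i, m in enumerate(margins) if m < threshold]

# Illustrative usage with toy logits:
m = margin_loss(np.array([2.0, 1.0, 0.5]), label=0)   # 2.0 - 1.0 = 1.0
kept = select_for_restart([0.01, 0.2, -0.3])          # indices 0 and 2
```

Instances with a large positive margin are confidently classified and unlikely to flip under further restarts, so skipping them is what saves propagation time.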