Provably Cost-Sensitive Adversarial Defense via Randomized Smoothing
Authors: Yuan Xin, Dingfan Chen, Michael Backes, Xiao Zhang
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on benchmark datasets, including challenging ones unsolvable by existing methods, demonstrate the effectiveness of our certification algorithm and training method across various cost-sensitive scenarios. |
| Researcher Affiliation | Academia | CISPA Helmholtz Center for Information Security, Saarbrücken, Germany; Max Planck Institute for Intelligent Systems, Tübingen, Germany. |
| Pseudocode | Yes | Algorithm 1 Certification for Cost-Sensitive Robustness |
| Open Source Code | Yes | To ensure reproducibility and accessibility, our method and the implementations of our experiments are available as open source code at: https://github.com/AppleXY/Cost-Sensitive-RS. |
| Open Datasets | Yes | We evaluate our method on the standard benchmark datasets: CIFAR-10 (Krizhevsky et al., 2009), Imagenette, and the full ImageNet dataset (Deng et al., 2009). In addition, we assess its performance on the real-world medical dataset HAM10k (Tschandl et al., 2018) |
| Dataset Splits | Yes | For CIFAR-10 and HAM10k, we use the ResNet architecture following Cohen et al. (2019) as the target classification model... For ImageNet, we use ResNet-18, following Pethick et al. (2023). ...The CIFAR-10 dataset...with 50,000 training images and 10,000 test images...The ImageNet dataset...is divided into a training set with 1.2 million images and a validation set with 50,000 images. |
| Hardware Specification | Yes | For CIFAR-10, Imagenette, and HAM10k, each experiment is run on a single NVIDIA A100 GPU with 40 GB of memory within one day. For the ImageNet dataset, each experiment is conducted on four NVIDIA A100 GPUs with 40 GB of memory for 1-2 days. |
| Software Dependencies | No | The paper does not explicitly mention specific software dependencies with version numbers for its implementation. |
| Experiment Setup | Yes | Consistent with common evaluation practices (Cohen et al., 2019), we focus on the setting of ϵ = 0.5 and σ = 0.5 in our experiments, while we observe similar trends under other settings (see Appendix D for all the additional experimental results)...For Gaussian-CS, SmoothAdv-CS and SmoothMix-CS, the parameter λ is carefully tuned...For MACER, the parameter λ ... is fixed at 4 by default. Similarly, in our Margin-CS method, we set λ1 = 3 and λ2 = 3 according to observation from Table 7. We present the results of varying hyperparameters γ1 and γ2 in our Margin-CS method... |
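For context on the certification step the table references (Algorithm 1, evaluated at σ = 0.5 and ϵ = 0.5), the sketch below shows the standard randomized-smoothing radius computation in the style of Cohen et al. (2019), not the paper's cost-sensitive variant. It substitutes a simple Hoeffding lower confidence bound on the top-class probability for the exact Clopper-Pearson bound typically used; the function name and sample counts are illustrative assumptions.

```python
import math
from statistics import NormalDist

def certify_radius(n_correct: int, n_total: int, sigma: float,
                   alpha: float = 0.001):
    """Certified L2 radius for a Gaussian-smoothed classifier.

    n_correct: noisy samples assigned to the predicted class
    n_total:   total Monte Carlo noise samples drawn
    sigma:     std of the Gaussian smoothing noise
    alpha:     allowed failure probability of the certificate
    Returns the certified radius, or None to abstain.
    """
    p_hat = n_correct / n_total
    # One-sided Hoeffding bound: P(p_A < p_lower) <= alpha.
    # (Cohen et al. use the tighter Clopper-Pearson bound here.)
    p_lower = p_hat - math.sqrt(math.log(1.0 / alpha) / (2.0 * n_total))
    if p_lower <= 0.5:
        return None  # majority class not certified; abstain
    # R = sigma * Phi^{-1}(p_lower): radius grows with confidence in p_A
    return sigma * NormalDist().inv_cdf(p_lower)

# Illustrative counts at the paper's main setting sigma = 0.5:
# a high agreement rate certifies a radius exceeding epsilon = 0.5,
# while a marginal one forces an abstention.
r_strong = certify_radius(n_correct=990, n_total=1000, sigma=0.5)
r_weak = certify_radius(n_correct=550, n_total=1000, sigma=0.5)
```

A cost-sensitive certification, as in the paper, would additionally weight which class confusions count as violations; this sketch only reproduces the class-agnostic radius that such a method builds on.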