Model Immunization from a Condition Number Perspective

Authors: Amber Yijia Zheng, Site Bai, Brian Bullins, Raymond A. Yeh

ICML 2025

Reproducibility assessment. Each variable below lists the result and the supporting LLM response.
Research Type: Experimental. Evidence: "Empirical results on linear models and non-linear deep-nets demonstrate the effectiveness of the proposed algorithm on model immunization. ... Beyond the theoretical results, we empirically validate the proposed algorithm on linear models for regression and image classification tasks. Lastly, we conduct experiments using the proposed algorithm on non-linear models, i.e., deep-nets."
Researcher Affiliation: Academia. Evidence: "1Department of Computer Science, Purdue University. Correspondence to: Raymond A. Yeh <EMAIL>."
Pseudocode: Yes. Evidence: "Algorithm 1: Condition number regularized gradient descent for model immunization. ... Pseudo-code is provided in Appendix C.3. ... We provide the pseudo-code for implementing the dummy layer in Fig. 4 below."
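The paper's actual Algorithm 1 and dummy-layer pseudo-code live in its Appendix C.3; as a rough illustration of the ingredient the algorithm's name describes, here is a minimal NumPy sketch of gradient descent with a condition-number term. It uses the SVD perturbation identity dσ_i/dA = u_i v_iᵀ to get an exact gradient of κ(W) = σ_max/σ_min. The toy objective, data, and step sizes are all illustrative assumptions, not the paper's method.

```python
import numpy as np

def cond_and_grad(A):
    """kappa(A) = s_max / s_min and its gradient w.r.t. A.

    Using d(s_i)/dA = u_i v_i^T (distinct singular values assumed):
      grad kappa = u_1 v_1^T / s_min - (s_max / s_min**2) * u_n v_n^T
    """
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    kappa = s[0] / s[-1]
    grad = (np.outer(U[:, 0], Vt[0, :]) / s[-1]
            - (s[0] / s[-1] ** 2) * np.outer(U[:, -1], Vt[-1, :]))
    return kappa, grad

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))        # toy inputs for the task we want to keep
Y = X @ rng.normal(size=(5, 5))      # toy targets
W = rng.normal(size=(5, 5))          # linear "feature extractor"

lr, lam = 1e-3, 0.1                  # illustrative step size / trade-off weight
kappas = []
for _ in range(50):
    kappa, kgrad = cond_and_grad(W)
    kappas.append(kappa)
    task_grad = X.T @ (X @ W - Y) / len(X)   # least-squares gradient
    # Descend the task loss while ascending kappa(W); the "- lam * kgrad"
    # inside the parentheses pushes the condition number up, making the
    # learned features ill-conditioned for downstream (harmful) fine-tuning.
    W -= lr * (task_grad - lam * kgrad)
```

In practice such a loop needs step-size care, since the κ gradient blows up as σ_min shrinks; this sketch only shows the shape of the update.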
Open Source Code: Yes. Evidence: "The code is available at https://github.com/amberyzheng/model-immunization-cond-num."
Open Datasets: Yes. Evidence: "We use the regression task from the House prices dataset (Montoya & DataCanary, 2016). ... conduct experiments using MNIST (LeCun, 1998). ... ImageNet (Deng et al., 2009). ... Stanford Cars Dataset (Krause et al., 2013) and Country211 Dataset (Radford et al., 2021)."
Dataset Splits: No. Evidence: "We split the data into D_P and D_H based on the feature MSZoning. ... The MNIST dataset consists of images over 10-digit classes, which can be formulated into 10 independent binary classification tasks. Across all pairs of tasks, we choose one to be the harmful task D_H and the other the pre-training task D_P, resulting in a total of 90 experiments." The paper describes how D_P and D_H are constructed but does not explicitly provide train/test/validation splits for the experiments within these datasets.
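The 90-experiment count follows from choosing an ordered pair of distinct digit classes (one as D_H, a different one as D_P): 10 × 9 = 90. A one-line check of that construction:

```python
from itertools import permutations

# 10 digit classes; one is the harmful task D_H, a different one the
# pre-training task D_P, so pairs are ordered and distinct: 10 * 9 = 90.
task_pairs = list(permutations(range(10), 2))
print(len(task_pairs))  # 90
```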
Hardware Specification: No. The paper mentions experiments with ResNet18 and ViT architectures and notes "Due to memory constraints" but does not specify any particular GPU models, CPU types, or other hardware details used for the experiments.
Software Dependencies: No. The paper mentions using PyTorch (implied by `torch.nn.Linear` in the pseudocode), the Adam optimizer, and SGD with Nesterov momentum, and references "PyTorch Image Models (Wightman, 2019)", but does not provide specific version numbers for any of these software components.
Experiment Setup: Yes. Evidence: "We summarize the hyper-parameters of training for model immunization in Tab. 4. ... For optimization, we use Adam (Kingma, 2014) with β = (0.9, 0.999) and ε = 1 × 10^-8. ... For optimization, we use SGD with Nesterov momentum ... setting an initial learning rate of 1 × 10^-5 with momentum 0.9. ... trainable feature extractor parameters are optimized with zero weight decay, while the classifier parameters use a weight decay of 2 × 10^-5. All experiments are conducted using float64 precision."
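The reported optimizer settings map directly onto standard PyTorch configuration; the sketch below wires them up for an illustrative stand-in model (the paper's actual models are ResNet18/ViT feature extractors, and the module names here are assumptions).

```python
import torch
from torch import nn

torch.set_default_dtype(torch.float64)  # "All experiments are conducted using float64 precision."

# Illustrative stand-in modules; the paper uses ResNet18 / ViT backbones.
feature_extractor = nn.Linear(512, 128)
classifier = nn.Linear(128, 10)

# Adam with beta = (0.9, 0.999) and eps = 1e-8, as reported.
adam = torch.optim.Adam(
    feature_extractor.parameters(), betas=(0.9, 0.999), eps=1e-8
)

# SGD with Nesterov momentum, initial lr = 1e-5, momentum = 0.9;
# zero weight decay on the feature extractor and 2e-5 on the classifier,
# expressed via per-parameter-group options.
sgd = torch.optim.SGD(
    [
        {"params": feature_extractor.parameters(), "weight_decay": 0.0},
        {"params": classifier.parameters(), "weight_decay": 2e-5},
    ],
    lr=1e-5,
    momentum=0.9,
    nesterov=True,
)
```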