Certified Unlearning for Neural Networks

Authors: Anastasia Koloskova, Youssef Allouah, Animesh Jha, Rachid Guerraoui, Sanmi Koyejo

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We analyze the theoretical trade-offs in efficiency and accuracy and demonstrate empirically that our method not only achieves formal unlearning guarantees but also performs effectively in practice, outperforming existing baselines. Our code is available at https://github.com/stair-lab/certified-unlearning-neural-networks-icml-2025
Researcher Affiliation | Academia | 1 Stanford University, USA; 2 EPFL, Switzerland. Correspondence to: Anastasia Koloskova <EMAIL>, Youssef Allouah <EMAIL>.
Pseudocode | No | The paper describes algorithms using equations and prose (e.g., in Section 3 'Algorithm'), but it does not present any formal pseudocode blocks or clearly labeled algorithm figures with structured steps.
Open Source Code | Yes | Our code is available at https://github.com/stair-lab/certified-unlearning-neural-networks-icml-2025
Open Datasets | Yes | In this section, we present an empirical evaluation of our proposed unlearning method, in its two variants Gradient Clipping (3) and Model Clipping (4), on two benchmark datasets: MNIST (Deng, 2012) and CIFAR-10 (Krizhevsky et al., 2014). [...] To evaluate our methods in more complex settings, we conducted experiments on CIFAR-100 and CIFAR-10 using ResNet architectures (He et al., 2016) pretrained on public data (ImageNet (Deng et al., 2009)).
Dataset Splits | Yes | In both cases, the forget set consists of a randomly selected 10% subset of the full dataset.
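The forget/retain split described in the quote above can be sketched as follows. This is a minimal illustration, not code from the paper's repository; the function name, seed, and dataset size are assumptions chosen for the example.

```python
import random

def split_forget_retain(indices, forget_frac=0.10, seed=0):
    """Randomly select a forget set (10% of the data, as in the paper)
    and keep the remaining samples as the retain set."""
    rng = random.Random(seed)
    n_forget = int(len(indices) * forget_frac)
    forget = set(rng.sample(indices, n_forget))
    retain = [i for i in indices if i not in forget]
    return sorted(forget), retain

# Example: an MNIST-sized dataset of 60,000 samples gives a
# 6,000-sample forget set and a 54,000-sample retain set.
forget, retain = split_forget_retain(list(range(60000)))
```

Fixing the seed makes the split reproducible, which matters here because the unlearning method and all baselines must be evaluated on the same forget set.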
Hardware Specification | No | The paper does not provide specific details about the hardware used for experiments, such as GPU models, CPU types, or memory specifications. It only mentions training neural networks without specifying the underlying hardware infrastructure.
Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies, libraries, or programming languages used in the experiments. It only mentions general concepts like 'stochastic gradient descent (SGD)' and 'ResNet architectures'.
Experiment Setup | Yes | For MNIST, we train a small neural network with two layers and approximately 4,000 parameters. For CIFAR-10, we use a slightly larger network with two convolutional blocks followed by a linear layer, totaling 20,000 parameters. [...] All training, unlearning, and fine-tuning phases use stochastic gradient descent (SGD) with a constant step size. Further experimental details are provided in the appendix. (See Tables 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 in Appendix B for detailed hyperparameters).
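The paper reports the MNIST model only as "two layers and approximately 4,000 parameters" without giving layer widths. A quick sanity check, assuming a fully connected network with biases on 784-dimensional MNIST inputs and 10 output classes, shows that a hidden width of 5 is one configuration consistent with that count; the exact architecture would have to be confirmed against the released code.

```python
def mlp_param_count(layer_sizes):
    """Parameter count of a fully connected network with biases:
    each layer contributes n_in * n_out weights plus n_out biases."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

# 784 -> 5 -> 10: (784*5 + 5) + (5*10 + 10) = 3925 + 60 = 3985,
# close to the "approximately 4,000 parameters" reported.
mnist_params = mlp_param_count([784, 5, 10])
```

The same helper applied to candidate CIFAR-10 linear heads can be used to cross-check the reported 20,000-parameter figure, though the convolutional blocks require a separate count.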