A Certified Unlearning Approach without Access to Source Data

Authors: Umit Yigit Basaran, Sk Miraj Ahmed, Amit Roy-Chowdhury, Basak Guler

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We establish theoretical bounds, introduce practical noise calibration techniques, and validate our method through extensive experiments on both synthetic and real-world datasets. The results demonstrate the effectiveness and reliability of our approach in privacy-sensitive settings.
Researcher Affiliation | Academia | (1) Department of Electrical and Computer Engineering, University of California, Riverside, CA, USA; (2) Brookhaven National Laboratory, Upton, NY, USA.
Pseudocode | Yes | Algorithm 1: Unlearning Mechanism Leveraging Surrogate Data Statistics
Open Source Code | Yes | Our main implementation used for this paper is available at https://github.com/info-ucr/certified-unlearning-surr-data. We also implemented the mixed-linear networks (Golatkar et al., 2021) from scratch; that code is available at https://github.com/info-ucr/mixed-privacy-forgetting.
Open Datasets | Yes | We further evaluate our method on CIFAR10 (Krizhevsky et al., 2009), Caltech256 (Griffin et al., 2007), and Stanford Dogs (Khosla et al., 2011)... MNIST (Lecun et al., 1998) and USPS (Hull, 1994) datasets
Dataset Splits | Yes | Unless otherwise noted, we adopt a linear training model with forget ratio of 0.1... We evaluate the performance using train, test, retain, and forget accuracies on their respective data splits... In both cases, our method achieves effective certified unlearning and maintains competitive accuracy on the retained data, demonstrating that mixed linear networks provide a practical and theoretically sound foundation for unlearning in neural models. MIA scores are omitted for class unlearning because the attack is designed to distinguish between test and forget samples; forgetting an entire class greatly increases distinguishability, making the MIA score uninformative.
Hardware Specification | No | The paper does not provide specific hardware details used for running its experiments.
Software Dependencies | No | The paper does not provide specific ancillary software details with version numbers.
Experiment Setup | Yes | Unless stated otherwise, we use a linear training model with privacy parameters ϵ = 5e3 and δ = 1, a forget ratio of 0.1, and an L2 regularization constant of λ = 0.01... We set α = 1+λ, L = 1, β = 1, and γ = 1. To sample from the marginal distribution of the exact data, we used Stochastic Gradient Langevin Dynamics (SGLD) with step size 0.02 and generated 1000 samples, applying 4000 random-update iterations per generated sample. After sampling, to estimate the KL divergence via the Donsker-Varadhan variational bound, we trained a network with three linear layers for 500 epochs with learning rate 0.0001 using the Adam optimizer.
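The SGLD procedure described in the setup (step size 0.02, many Langevin updates per generated sample) can be sketched as below. This is a minimal illustration, not the paper's implementation: the function name is invented, and the demo targets a toy standard normal (whose score is grad log p(x) = -x) rather than the marginal distribution of the exact data; the demo also uses far fewer samples and iterations than the paper's 1000 samples x 4000 iterations so it runs quickly.

```python
import numpy as np

def sgld_sample(grad_log_p, dim, step_size=0.02, n_samples=100, n_iters=200, seed=0):
    """Draw samples with Stochastic Gradient Langevin Dynamics (SGLD).

    Each sample starts from a fresh Gaussian initialization and is refined by
    repeated Langevin updates:
        x <- x + (step_size / 2) * grad_log_p(x) + sqrt(step_size) * noise
    """
    rng = np.random.default_rng(seed)
    samples = np.empty((n_samples, dim))
    for i in range(n_samples):
        x = rng.standard_normal(dim)
        for _ in range(n_iters):
            noise = rng.standard_normal(dim)
            x = x + 0.5 * step_size * grad_log_p(x) + np.sqrt(step_size) * noise
        samples[i] = x
    return samples

# Demo: sample from a standard normal target, where grad log p(x) = -x.
draws = sgld_sample(lambda x: -x, dim=1, n_samples=200, n_iters=200)
```

With the score of the true target plugged in for `grad_log_p`, the empirical mean and variance of `draws` approach those of the target as the step size shrinks and the iteration count grows.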
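The KL estimation step can likewise be sketched. The paper trains a three-layer network with Adam (learning rate 0.0001, 500 epochs) as the Donsker-Varadhan critic; the stand-in below instead uses a linear critic trained by plain gradient ascent, which keeps the example short and is already optimal when P and Q are mean-shifted Gaussians. The function name, learning rate, and step count here are illustrative assumptions, not the paper's values.

```python
import numpy as np

def dv_kl_estimate(p_samples, q_samples, lr=0.1, steps=500):
    """Estimate KL(P || Q) via the Donsker-Varadhan variational bound

        KL(P || Q) >= E_P[T(x)] - log E_Q[exp(T(x))]

    maximized over a linear critic T(x) = w @ x by gradient ascent.
    """
    w = np.zeros(p_samples.shape[1])
    for _ in range(steps):
        tq = q_samples @ w
        soft = np.exp(tq - tq.max())
        soft /= soft.sum()  # softmax weights: gradient of log E_Q[exp(T)]
        grad = p_samples.mean(axis=0) - soft @ q_samples
        w += lr * grad      # ascend the DV lower bound
    tp, tq = p_samples @ w, q_samples @ w
    m = tq.max()            # log-sum-exp shift for numerical stability
    return tp.mean() - (m + np.log(np.mean(np.exp(tq - m))))

# Demo: P = N(1, 1), Q = N(0, 1), whose true KL divergence is 0.5.
rng = np.random.default_rng(0)
kl = dv_kl_estimate(rng.normal(1.0, 1.0, (2000, 1)),
                    rng.normal(0.0, 1.0, (2000, 1)))
```

Because the optimal critic for two unit-variance Gaussians is linear, the estimate converges near the true value 0.5; a multilayer critic, as in the paper, is needed when the log density ratio is nonlinear.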