The Utility and Complexity of In- and Out-of-Distribution Machine Unlearning
Authors: Youssef Allouah, Joshua Kazdan, Rachid Guerraoui, Sanmi Koyejo
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Figure 1: Numerical validation on a linear regression task with synthetic data for the same unlearning budget, with in-distribution (left) and out-of-distribution (right) data. The in-distribution forget set is sampled at random, while the out-of-distribution data is obtained by shifting labels with a fixed offset. Additional details and results on real data can be found in Appendix F. |
| Researcher Affiliation | Academia | Youssef Allouah1, Joshua Kazdan2, Rachid Guerraoui1, Sanmi Koyejo2 — 1EPFL, Switzerland; 2Stanford University, USA. EMAIL, EMAIL |
| Pseudocode | Yes | Algorithm 1 Unlearning via Noisy Minimizer Approximation Algorithm 2 Unlearning via Robust Training and Noisy Minimizer Approximation |
| Open Source Code | No | The paper does not explicitly state that source code for its methodology is made available. It mentions using "the DP-SGD implementation of Opacus (Yousefpour et al., 2021)" which refers to a third-party library, not their own code. |
| Open Datasets | Yes | We extend the empirical validation in Figure 1 in the out-of-distribution scenario to the California Housing dataset (Pace and Barry, 1997), a standard regression benchmark. |
| Dataset Splits | No | The paper describes how forget data is sampled from a larger set of training samples (e.g., "f = 20 forget data out of 10,000 samples", "f ∈ {1, 0.1n, 0.45n} forget data out of 1,000 samples"), and for the California Housing dataset it states "The dataset contains 20,640 samples". However, it does not provide specific train/test/validation splits (e.g., percentages or exact counts) for the overall datasets used to initially train the models, nor does it reference standard splits. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments. |
| Software Dependencies | No | The paper mentions using "the DP-SGD implementation of Opacus (Yousefpour et al., 2021)" but does not specify version numbers for Opacus or any other software components. |
| Experiment Setup | Yes | The full data features are generated from a d-dimensional Gaussian N(0, Id), d = 100, and the labels are generated from the features and a random true underlying model, also drawn from a Gaussian N(0, Id), with a Gaussian response. The in-distribution forget data consists of f = 20 data points sampled at random from the 10,000 full training samples, and the unlearning budget for the in-distribution scenario is set to ε = 1. For the out-of-distribution scenario, the forget data is obtained by shifting labels by a fixed offset of 10³, with 1,000 total training samples, and the unlearning budget is set to ε = 10. The learning rates for Algorithms 1 and 2 are set following standard theoretical convergence rates (Nesterov et al., 2018) for strongly convex tasks, and Theorem 3. We use the DP-SGD implementation of Opacus (Yousefpour et al., 2021), ... and run the optimizer until convergence or for 100 epochs (usually 10 more than Algorithm 1) with a fine-tuned learning rate. |
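For concreteness, the synthetic setup quoted in the Experiment Setup row can be sketched as below. This is a minimal reconstruction, not the authors' code: the paper does not state the response noise scale or random seed, so the unit-variance Gaussian response and the fixed seed here are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed is an assumption; the paper does not specify one

# Dimensions and sample sizes as reported in the paper's setup.
d = 100
n_in = 10_000   # in-distribution scenario: 10,000 training samples
n_out = 1_000   # out-of-distribution scenario: 1,000 training samples
f = 20          # in-distribution forget-set size
offset = 1e3    # fixed label offset defining the out-of-distribution forget data

# Features and a random true underlying model, both drawn from N(0, I_d),
# with a Gaussian response around the linear prediction (noise scale assumed).
X = rng.standard_normal((n_in, d))
w_true = rng.standard_normal(d)
y = X @ w_true + rng.standard_normal(n_in)

# In-distribution forget set: f points sampled at random from the training data.
forget_idx = rng.choice(n_in, size=f, replace=False)

# Out-of-distribution forget data: same generative process on 1,000 samples,
# with labels shifted by the fixed offset.
X_out = rng.standard_normal((n_out, d))
y_out = X_out @ w_true + rng.standard_normal(n_out)
y_shifted = y_out + offset
```

A split like this makes the paper's comparison concrete: the in-distribution forget set is statistically indistinguishable from the retained data, while the offset labels place the out-of-distribution forget set far from the training distribution.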