Selective Unlearning via Representation Erasure Using Domain Adversarial Training

Authors: Nazanin Sepahvand, Eleni Triantafillou, Hugo Larochelle, Doina Precup, Jim Clark, Dan Roy, Gintare Karolina Dziugaite

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Through extensive empirical evaluation, we demonstrate that SURE not only achieves a superior trade-off between unlearning effectiveness and model utility but also exhibits reduced vulnerability to representation-space MIAs. In this section, we present our findings showing that our method not only achieves a superior trade-off between performance and unlearning efficiency compared to existing approaches but also addresses some of the limitations of current unlearning techniques. Specifically, our method mitigates the instability issues seen in previous methods, where results can change significantly with minor adjustments to the unlearning setup compared to the one used for hyperparameter tuning (Fan et al., 2023). In addition, SURE reduces vulnerabilities to basic membership inference attacks in the representation space that are exhibited by some other methods. We assess our method across three distinct unlearning scenarios: random unlearning, where forget samples are randomly selected from the entire training set; partial class unlearning, where the forget set (Df) is a randomly chosen subset of a specific class; and class unlearning, where the forget set (Df) comprises all samples from a particular class, with the primary focus on the partial class unlearning scenario.
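The paper provides no pseudocode (see the Pseudocode row below), but its title names the core mechanism: representation erasure via domain adversarial training. A DANN-style setup conventionally implements this with a gradient reversal layer between the feature extractor and a discriminator (here, one distinguishing forget from retain samples). The following is a minimal, framework-free sketch of that layer only, not the authors' implementation; all names and the lambda value are hypothetical.

```python
import numpy as np

class GradientReversal:
    """Identity map in the forward pass; flips (and scales) gradients in the
    backward pass. In DANN-style training this layer sits between the feature
    extractor and a forget-vs-retain discriminator, so the extractor is pushed
    to erase the information the discriminator relies on."""

    def __init__(self, lam=1.0):
        self.lam = lam  # strength of the reversed signal (hypothetical value)

    def forward(self, x):
        return x  # pass features through unchanged

    def backward(self, grad_output):
        return -self.lam * grad_output  # reverse and scale the incoming gradient


grl = GradientReversal(lam=0.5)
feats = np.array([1.0, -2.0, 3.0])
out = grl.forward(feats)                      # identical to feats
grad_in = grl.backward(np.array([0.2, 0.0, -0.4]))  # reversed, scaled by 0.5
```

In a full training loop the discriminator minimizes its forget-vs-retain loss while, through this reversed gradient, the feature extractor maximizes it, which is what makes the learned representations uninformative about membership.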
Researcher Affiliation | Collaboration | 1McGill University, Mila, Canada; 2Google DeepMind; 3University of Toronto, Canada
Pseudocode | No | The paper describes the methodology using textual explanations and mathematical equations (e.g., Section 4, Appendix C) but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | No | To ensure reproducibility, we provide a comprehensive outline of our experimental setup in Section 5.1, including details on hyperparameters, hyperparameter tuning strategies, datasets, and algorithms. The precise hyperparameter settings are further documented in Appendix G. Our stability experiments, as detailed in Section 5.2, significantly enhance reproducibility and reveal inherent instability in some previously proposed methods. Furthermore, we address a critical and often overlooked aspect of unlearning algorithms in Section 5.2 by carefully outlining our stopping criteria selection process. Finally, all average numerical values in our experimental results are presented with error bars computed over 5 independent runs.
Open Datasets | Yes | More specifically, we evaluate our method on two additional datasets: CINIC-10 (Darlow et al., 2018), an extension of CIFAR-10 augmented with samples from ImageNet, and TinyImageNet (Le & Yang, 2015) in Appendix D.
Dataset Splits | Yes | The source training dataset, D = {(x_i, y_i)}_{i=1}^{N}, comprises N i.i.d. samples drawn from the data distribution. During unlearning, D is partitioned into two mutually exclusive subsets: a forget set Df, containing samples for which the model must unlearn information; and a retain set Dr = D \ Df. In addition, we assume access to a validation set Dval. For simplicity of presentation, this validation set is assumed to be of the same size as the forget set. ... For the random unlearning method, hyperparameter tuning was carried out with a forget set comprising 5,000 samples. In contrast, for (partial) class unlearning, the forget set was considered to contain 500 samples. ... In the random unlearning scenario, the forget set consists of 1,000 samples randomly selected from the entire training set, while in the partial unlearning scenario, it consists of 100 samples randomly chosen from class 5.
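The partition described in this row (mutually exclusive Df and Dr = D \ Df, with Df built differently per scenario) can be sketched as follows. This is an illustrative reconstruction on synthetic labels, not the authors' code; the dataset size, class count, and function name are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 50_000                        # CIFAR-10-sized training set (assumption)
labels = rng.integers(0, 10, N)   # synthetic labels over 10 classes

def split_forget_retain(scenario, forget_size=1000, target_class=5):
    """Build mutually exclusive forget/retain index sets, Df and Dr = D \\ Df,
    for the three unlearning scenarios described in the paper."""
    idx = np.arange(N)
    if scenario == "random":           # Df drawn from the whole training set
        forget = rng.choice(idx, size=forget_size, replace=False)
    elif scenario == "partial_class":  # Df is a random subset of one class
        in_class = idx[labels == target_class]
        forget = rng.choice(in_class, size=forget_size, replace=False)
    elif scenario == "class":          # Df is the entire class
        forget = idx[labels == target_class]
    else:
        raise ValueError(f"unknown scenario: {scenario}")
    retain = np.setdiff1d(idx, forget)  # everything not forgotten is retained
    return forget, retain

# Partial class unlearning as evaluated: 100 samples from class 5.
forget, retain = split_forget_retain("partial_class", forget_size=100)
```

A validation set of the same size as Df (as the paper assumes) could be drawn analogously from held-out data.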
Hardware Specification | No | ACKNOWLEDGEMENT: We thank Mila Quebec AI Institute and Google DeepMind for providing the computational resources that supported this work. The paper mentions 'computational resources' but does not provide specific hardware details such as GPU or CPU models.
Software Dependencies | No | The paper mentions 'ResNet-18' as the backbone for the feature extractor module, which is a model architecture, but does not specify any software libraries or their version numbers used for implementation.
Experiment Setup | Yes | Implementation Details: Extensive hyperparameter tuning was conducted independently for each method. The primary hyperparameters optimized were the learning rate and α, where α controls the weight of the retain set loss in methods with a weighted loss function that combines losses for the retain and forget sets. Additional hyperparameters, including batch size for the forget set and learning rate scheduling, were also tuned. Optimal hyperparameters for each model were determined through Bayesian optimization to achieve the best unlearning to utility trade-off. The same original model was used across all unlearning methods, trained for 150 epochs with a learning rate of 0.01 and a weight decay of 0.0005, without data augmentation, with the learning rate reduced by an order of magnitude at epochs 80 and 150. ... The optimized values for all hyperparameters across unlearning methods in both random and partial class unlearning scenarios are presented in Table 3 and Table 4, respectively.
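The quoted training schedule (initial learning rate 0.01, reduced by an order of magnitude at epochs 80 and 150) is a standard step schedule. A minimal framework-free sketch is below; in PyTorch this would typically correspond to `SGD(lr=0.01, weight_decay=5e-4)` combined with `MultiStepLR(milestones=[80, 150], gamma=0.1)`, though the paper does not name a framework, so that mapping is an assumption.

```python
def lr_at_epoch(epoch, base_lr=0.01, milestones=(80, 150), gamma=0.1):
    """Step learning-rate schedule matching the quoted setup: the rate drops
    by an order of magnitude (gamma=0.1) at each milestone epoch."""
    lr = base_lr
    for m in milestones:
        if epoch >= m:
            lr *= gamma
    return lr

# Schedule over the described 150-epoch run:
# epochs 0-79 at 0.01, epochs 80-149 at 0.001.
schedule = [lr_at_epoch(e) for e in range(150)]
```

Note that with a 150-epoch run, a reduction "at epoch 150" (as the quote states) would only take effect if training continued past the final epoch; the sketch reproduces the description as written.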