Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Deep Unlearning: Fast and Efficient Gradient-free Class Forgetting

Authors: Sangamesh Kodge, Gobinda Saha, Kaushik Roy

TMLR 2024 | Venue PDF | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental We demonstrate our algorithm's efficacy on ImageNet using a Vision Transformer with only a 1.5% drop in retain accuracy compared to the original model while maintaining under 1% accuracy on the unlearned class samples. Furthermore, our algorithm exhibits competitive unlearning performance and resilience against Membership Inference Attacks (MIA). Compared to baselines, it achieves an average accuracy improvement of 1.38% on the ImageNet dataset while requiring up to 10× fewer samples for unlearning. Additionally, under stronger MIA attacks on the CIFAR-100 dataset using a ResNet18 architecture, our approach outperforms the best baseline by 1.8%. Our code is available on GitHub.
Researcher Affiliation Academia Sangamesh Kodge (EMAIL), Elmore Family School of Electrical and Computer Engineering, Purdue University; Gobinda Saha (EMAIL), Elmore Family School of Electrical and Computer Engineering, Purdue University; Kaushik Roy (EMAIL), Elmore Family School of Electrical and Computer Engineering, Purdue University
Pseudocode Yes Algorithm 1 presents the pseudocode of our approach.
Open Source Code Yes Our code is available at https://github.com/sangamesh-kodge/class_forgetting.
Open Datasets Yes We conduct the class unlearning experiments on the CIFAR10, CIFAR100 (Krizhevsky et al., 2009), and ImageNet (Deng et al., 2009) datasets.
Dataset Splits Yes Let the training dataset be denoted by Dtrain = {(x_i, y_i)}_{i=1}^{N_train}, consisting of N_train training samples, where x_i represents the network input and y_i is the corresponding label. Let the test dataset be Dtest = {(x_i, y_i)}_{i=1}^{N_test}, containing N_test samples. For a class unlearning task aimed at removing the target class t, the training dataset can be split into two partitions, namely the retain dataset Dtrain_r = {(x_i, y_i) : y_i ≠ t} and the forget dataset Dtrain_f = {(x_i, y_i) : y_i = t}. Similarly, the test dataset can be split into the partitions Dtest_r = {(x_i, y_i) : y_i ≠ t} and Dtest_f = {(x_i, y_i) : y_i = t}. ...
Table 6: Hyperparameters for our approach with single-class unlearning or sequential multi-class unlearning.
Dataset  | alpha_r_list               | alpha_f_list          | samples/class in Xr | samples/class in Xf
CIFAR10  | [10, 30, 100, 300, 1000]   | [3]                   | 100                 | 900
CIFAR100 | [100, 300, 1000]           | [3, 10, 30, 100]      | 10                  | 990
ImageNet | [30, 100, 300, 1000, 3000] | [3, 10, 30, 100, 300] | 1                   | 500
Hardware Specification Yes We utilize implementations from the algorithms' official repositories and assess their runtime on a single Nvidia A40 GPU. Our algorithm achieves favorable runtime compared to most iterative methods considered in this work. Notably, the method referenced in Foster et al. (2024) exhibits the fastest runtime in this setting. However, the runtime advantage of Foster et al. (2024) diminishes significantly when compared on an AMD EPYC 7502 32-core CPU.
Software Dependencies No We use the modified versions of ResNet18 (He et al., 2016) and VGG11 (Simonyan & Zisserman, 2014) with batch normalization for the CIFAR10 and CIFAR100 datasets. ... For the ImageNet dataset, we use the pretrained VGG11 and the base version of the Vision Transformer with a patch size of 14 (Dosovitskiy et al., 2020) available in the torchvision library. ... Table 11 shows the results for our algorithm for different versions of the ViT model from the PyTorch library. No specific version numbers for PyTorch, torchvision, etc. are mentioned.
Experiment Setup Yes All CIFAR10 and CIFAR100 models are trained for 350 epochs using Stochastic Gradient Descent (SGD) with a learning rate of 0.01. We use Nesterov (Sutskever et al., 2013) accelerated momentum with a value of 0.9, and the weight decay is set to 5e-4. ...
Table 6: Hyperparameters for our approach with single-class unlearning or sequential multi-class unlearning.
Dataset  | alpha_r_list               | alpha_f_list          | samples/class in Xr | samples/class in Xf
CIFAR10  | [10, 30, 100, 300, 1000]   | [3]                   | 100                 | 900
CIFAR100 | [100, 300, 1000]           | [3, 10, 30, 100]      | 10                  | 990
ImageNet | [30, 100, 300, 1000, 3000] | [3, 10, 30, 100, 300] | 1                   | 500
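The retain/forget partition quoted under Dataset Splits reduces to filtering a labeled dataset by the target class t. A minimal sketch is below; the function name and the list-of-pairs dataset format are illustrative assumptions, not taken from the paper's code.

```python
def split_by_class(dataset, target_class):
    """Partition a labeled dataset into retain and forget subsets.

    dataset: iterable of (x, y) pairs; target_class: the class t to unlearn.
    Returns (retain, forget), where forget holds every sample with y == t
    and retain holds the rest, mirroring Dtrain_r / Dtrain_f above.
    """
    retain = [(x, y) for x, y in dataset if y != target_class]
    forget = [(x, y) for x, y in dataset if y == target_class]
    return retain, forget

# Toy example with three classes, unlearning class 1.
data = [("a", 0), ("b", 1), ("c", 2), ("d", 1)]
retain, forget = split_by_class(data, target_class=1)
```

The same split applies unchanged to the test set, yielding Dtest_r and Dtest_f.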