RESTOR: Knowledge Recovery in Machine Unlearning

Authors: Keivan Rezaei, Khyathi Chandu, Soheil Feizi, Yejin Choi, Faeze Brahman, Abhilasha Ravichander

TMLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "In this work, we propose the RESTOR framework for machine unlearning evaluation, which assesses the ability of unlearning algorithms for targeted data erasure, by evaluating the ability of models to forget the knowledge introduced in these datapoints, while simultaneously recovering the model's knowledge state had it never encountered these datapoints. RESTOR helps uncover several novel insights about popular unlearning algorithms, and the mechanisms through which they operate; for instance, identifying that some algorithms merely emphasize forgetting but not recovering knowledge, and that localizing unlearning targets can enhance unlearning performance."
Researcher Affiliation | Collaboration | Keivan Rezaei (University of Maryland), Khyathi Chandu (Mistral AI), Soheil Feizi (University of Maryland), Yejin Choi (Stanford University), Faeze Brahman (Allen Institute for AI), Abhilasha Ravichander (University of Washington)
Pseudocode | No | The paper describes algorithms such as Gradient Ascent, KL Divergence, and Negative Preference Optimization using mathematical formulations and objective functions, but does not present them in structured pseudocode or algorithm blocks.
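To make the Gradient Ascent objective referenced above concrete, here is a minimal toy sketch of an unlearning step that maximizes loss on forget examples while descending on a retain set. Everything in it is illustrative, not from the paper: the 1-parameter logistic model, the synthetic forget/retain points, and the `lam` regularization weight are all stand-in assumptions for the standard formulation.

```python
import math

def nll(w, x, y):
    # Negative log-likelihood of a toy 1-parameter logistic model.
    p = 1.0 / (1.0 + math.exp(-w * x))
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def grad_nll(w, x, y):
    # d(nll)/dw for the logistic model above.
    p = 1.0 / (1.0 + math.exp(-w * x))
    return (p - y) * x

def unlearn_step(w, forget, retain, lr=0.1, lam=1.0):
    # Gradient ASCENT on the forget set (increase its loss),
    # gradient descent on the retain set (preserve its loss),
    # balanced by the regularization weight lam.
    g_forget = sum(grad_nll(w, x, y) for x, y in forget) / len(forget)
    g_retain = sum(grad_nll(w, x, y) for x, y in retain) / len(retain)
    return w + lr * g_forget - lr * lam * g_retain

w = 2.0                # "trained" parameter before unlearning
forget = [(1.0, 1)]    # example whose fit should be destroyed
retain = [(-1.0, 1)]   # example whose fit should be preserved
for _ in range(20):
    w = unlearn_step(w, forget, retain)
# After unlearning, the loss on the forget example has increased.
```

The same skeleton extends to the KL-regularized variant by replacing the retain-set loss with a divergence from the original model's predictions.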
Open Source Code | Yes | Code/data is available at github.com/k1rezaei/restor.
Open Datasets | Yes | "We collect F, a set of 1051 facts about 50 famous individuals, from Wikidata. For the retain set, we use a subset of C4 (Raffel et al., 2020) and use cross-entropy of next-token-prediction as the loss function. Specifically, to create the corruption dataset, we utilize the SQuAD dataset (Rajpurkar, 2016)."
Dataset Splits | Yes | "To determine the optimal hyperparameters, we split facts into validation (10%) and test sets (90%), utilizing the validation set for hyperparameter tuning."
Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or other machine specifications used for running the experiments.
Software Dependencies | No | The paper mentions using 'Llama-3 8B' and 'Mistral 7B' as clean models, refers to 'LLaMA-Factory' for finetuning (with a GitHub link in a footnote), and describes using 'LoRA' (Hu et al., 2021) and 'GPT-4' and 'GPT-3.5' for data generation and evaluation. However, it does not provide specific version numbers for any of the software libraries or frameworks used in its methodology, such as LLaMA-Factory or any underlying deep learning frameworks.
Experiment Setup | Yes | C.1 Corruption section: model_name_or_path: meta-llama/Meta-Llama-3-8B, stage: pt, do_train: true, finetuning_type: lora, lora_target: all, per_device_train_batch_size: 2, gradient_accumulation_steps: 2, learning_rate: 5.0e-5, num_train_epochs: 5, lr_scheduler_type: cosine, warmup_ratio: 0.1, fp16: true. C.2 Unlearning section provides specific parameters for Gradient Ascent, KL Divergence, and Negative Preference Optimization, including 'num_epochs', 'lr', 'weight_decay', 'gradient_accumulation_steps', and regularization parameter 'lambda'.
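For concreteness, the corruption-stage hyperparameters quoted above can be gathered into a single configuration object. This is a minimal sketch: the dictionary layout is illustrative only (the paper passes these values to LLaMA-Factory as a YAML file), though the key names and values mirror the reported Appendix C.1 fields.

```python
# Corruption-stage hyperparameters reported in Appendix C.1, collected
# into one dict. Key names follow the LLaMA-Factory YAML fields; the
# dict itself is just an illustrative container, not the paper's code.
corruption_config = {
    "model_name_or_path": "meta-llama/Meta-Llama-3-8B",
    "stage": "pt",                     # continued pretraining
    "do_train": True,
    "finetuning_type": "lora",
    "lora_target": "all",
    "per_device_train_batch_size": 2,
    "gradient_accumulation_steps": 2,
    "learning_rate": 5.0e-5,
    "num_train_epochs": 5,
    "lr_scheduler_type": "cosine",
    "warmup_ratio": 0.1,
    "fp16": True,
}

# Effective per-device batch size implied by the reported settings.
effective_batch = (corruption_config["per_device_train_batch_size"]
                   * corruption_config["gradient_accumulation_steps"])
```

With batch size 2 and 2 gradient-accumulation steps, the effective per-device batch size is 4.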