RESTOR: Knowledge Recovery in Machine Unlearning
Authors: Keivan Rezaei, Khyathi Chandu, Soheil Feizi, Yejin Choi, Faeze Brahman, Abhilasha Ravichander
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this work, we propose the RESTOR framework for machine unlearning evaluation, which assesses the ability of unlearning algorithms to perform targeted data erasure, by evaluating whether models forget the knowledge introduced in these datapoints while simultaneously recovering the model's knowledge state had it never encountered these datapoints. RESTOR helps uncover several novel insights about popular unlearning algorithms and the mechanisms through which they operate: for instance, identifying that some algorithms merely emphasize forgetting but not recovering knowledge, and that localizing unlearning targets can enhance unlearning performance. |
| Researcher Affiliation | Collaboration | Keivan Rezaei EMAIL University of Maryland Khyathi Chandu EMAIL Mistral AI Soheil Feizi EMAIL University of Maryland Yejin Choi EMAIL Stanford University Faeze Brahman EMAIL Allen Institute for AI Abhilasha Ravichander EMAIL University of Washington |
| Pseudocode | No | The paper describes algorithms like Gradient Ascent, KL Divergence, and Negative Preference Optimization using mathematical formulations and objective functions, but does not present them in structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code/data is available at github.com/k1rezaei/restor. |
| Open Datasets | Yes | We collect F, a set of 1051 facts about 50 famous individuals, from Wikidata. For the retain set, we use a subset of C4 (Raffel et al., 2020) and use cross-entropy of next-token-prediction as the loss function. Specifically, to create the corruption dataset, we utilize the SQuAD dataset (Rajpurkar, 2016) |
| Dataset Splits | Yes | To determine the optimal hyperparameters, we split facts into validation (10%) and test sets (90%), utilizing the validation set for hyperparameter tuning. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or other computer specifications used for running the experiments. |
| Software Dependencies | No | The paper mentions using 'Llama-3 8B' and 'Mistral 7B' as clean models, refers to 'LLaMA-Factory' for finetuning (with a GitHub link in a footnote), and describes using 'LoRA' (Hu et al., 2021), and 'GPT-4' and 'GPT-3.5' for data generation and evaluation. However, it does not provide specific version numbers for any of the software libraries or frameworks used for its methodology, such as LLaMA-Factory or any underlying deep learning frameworks. |
| Experiment Setup | Yes | C.1 Corruption section: model_name_or_path: meta-llama/Meta-Llama-3-8B, stage: pt, do_train: true, finetuning_type: lora, lora_target: all, per_device_train_batch_size: 2, gradient_accumulation_steps: 2, learning_rate: 5.0e-5, num_train_epochs: 5, lr_scheduler_type: cosine, warmup_ratio: 0.1, fp16: true. C.2 Unlearning section provides specific parameters for Gradient Ascent, KL Divergence, and Negative Preference Optimization, including 'num_epochs', 'lr', 'weight_decay', 'gradient_accumulation_steps', and regularization parameter 'lambda'. |
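Since the paper presents its unlearning objectives only as mathematical formulations (see the Pseudocode row), the sketch below illustrates two of the named objectives, Gradient Ascent and Negative Preference Optimization, computed from per-token log-probabilities. The sequence-level summation, the function names, and the default `beta` are assumptions for illustration, not the authors' exact formulation.

```python
import math

def gradient_ascent_loss(logprobs):
    """Gradient Ascent on the forget set: maximize next-token NLL,
    i.e., minimize the (signed) sum of token log-probabilities.

    logprobs: per-token log-probabilities of the forget sequence
    under the model being unlearned (assumption: sequence-level sum).
    """
    return sum(logprobs)  # minimizing this pushes log-probs down

def npo_loss(logprobs, ref_logprobs, beta=0.1):
    """Negative Preference Optimization: a preference-style objective
    that penalizes the model's likelihood ratio against a frozen
    reference model (beta=0.1 is an assumed default).
    """
    log_ratio = sum(logprobs) - sum(ref_logprobs)
    return (2.0 / beta) * math.log1p(math.exp(beta * log_ratio))
```

Note that the KL Divergence variant mentioned alongside these would add a regularization term (weighted by the `lambda` listed in C.2) that keeps outputs on the retain set close to the original model; it is omitted here since it requires full output distributions rather than sequence log-probabilities.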