Rethinking LLM Unlearning Objectives: A Gradient Perspective and Go Beyond
Authors: Qizhou Wang, Jin Zhou, (Andrew) Zhanke Zhou, Saebyeol Shin, Bo Han, Kilian Weinberger
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We benchmark both existing and new methods explored throughout our analysis on the well-established TOFU fictitious unlearning datasets (Maini et al., 2024). Our experiments identify several new state-of-the-arts that merit further attention. Additionally, based on our analysis, we highlight promising research directions that warrant exploration to further advance the field. |
| Researcher Affiliation | Academia | 1TMLR Group, Department of Computer Science, Hong Kong Baptist University 2Department of Computer Science, Cornell University |
| Pseudocode | No | The paper describes methods using mathematical formulations and textual descriptions, but no explicit pseudocode or algorithm blocks are provided. |
| Open Source Code | Yes | The code is publicly available at: https://github.com/tmlr-group/G-effect. |
| Open Datasets | Yes | We benchmark both existing and new methods explored throughout our analysis on the well-established TOFU fictitious unlearning datasets (Maini et al., 2024). |
| Dataset Splits | Yes | For the unlearning setups, the original TOFU data are separated into targeted and non-targeted parts, of which the adopted proportions are 1:99 (1% unlearning), 5:95 (5% unlearning), and 10:90 (10% unlearning). Moreover, we separate 400 non-targeted data that are not involved during the unlearning procedure for evaluations. |
| Hardware Specification | Yes | Moreover, our experiments are conducted on computation nodes equipped with NVIDIA-A100-80GB GPUs and Intel(R) Xeon(R) Gold 6248R CPUs. |
| Software Dependencies | Yes | The systems utilize Transformers version 4.42.4 and CUDA version 12.1. |
| Experiment Setup | Yes | We default to apply the following settings: the AdamW optimizer (Loshchilov & Hutter, 2017), a batch size of 16, a maximal gradient norm of 1, and the (un)learning rate of 2e-5 for Phi-1.5 and 1e-5 for Llama-2-7b with linear warm-up for the first epoch. Each method is executed over a total of 5 epochs. Moreover, the model-specific hyper-parameters after fine-tuning are as follows: For the 1% and 5% setups, we set α = 5 for WGA; β = 0.5 for NPO; β = 4 for TNPO; α = 1.5 and β = 5 for WTNPO. For the 10% setup, we set α = 7 for WGA; β = 0.5 for NPO; β = 5 for TNPO; α = 1.5 and β = 7 for WTNPO. For RMU, we set the 9-th layer with c = 4 for Phi-1.5 and the 21-th layer with c = 2 for Llama-2-7B. |
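The quoted setup specifies a linear warm-up over the first of 5 epochs at a base rate of 2e-5 (Phi-1.5) or 1e-5 (Llama-2-7b). A minimal sketch of that schedule is below; the step count per epoch and the constant rate after warm-up are assumptions for illustration (the excerpt does not state either), and `lr_at_step` is a hypothetical helper, not code from the paper's repository.

```python
# Hypothetical sketch of the learning-rate schedule described in the setup:
# linear warm-up spanning the first of 5 epochs, then (assumed) constant.

BASE_LR = 2e-5         # Phi-1.5 in the quoted setup; 1e-5 for Llama-2-7b
EPOCHS = 5
STEPS_PER_EPOCH = 100  # hypothetical value; depends on dataset size / batch size 16


def lr_at_step(step: int) -> float:
    """Learning rate at a given 0-indexed optimizer step."""
    warmup_steps = STEPS_PER_EPOCH  # warm-up covers exactly the first epoch
    if step < warmup_steps:
        # Ramp linearly from BASE_LR / warmup_steps up to BASE_LR.
        return BASE_LR * (step + 1) / warmup_steps
    return BASE_LR  # assumption: held constant after warm-up


if __name__ == "__main__":
    print(lr_at_step(0))                    # small initial rate during warm-up
    print(lr_at_step(STEPS_PER_EPOCH - 1))  # reaches BASE_LR at end of epoch 1
    print(lr_at_step(3 * STEPS_PER_EPOCH))  # later epochs stay at BASE_LR
```

In practice this corresponds to pairing `torch.optim.AdamW` with a warm-up scheduler and per-step gradient clipping at the stated max norm of 1.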