Towards Effective Evaluations and Comparisons for LLM Unlearning Methods
Authors: Qizhou Wang, Bo Han, Puning Yang, Jianing Zhu, Tongliang Liu, Masashi Sugiyama
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our evaluation framework notably enhances the effectiveness when assessing and comparing various LLM unlearning methods, further allowing us to benchmark existing works, identify their proper hyper-parameters, and explore new tricks to enhance their practical efficacy. The code is publicly available at: https://github.com/tmlr-group/Unlearning-with-Control. ... 6 EXPERIMENTS We benchmark existing LLM unlearning methods using UWC, recommending their proper hyperparameters, assessing and comparing their efficacy in achieving effective unlearning. ... We report not only the ES scores for original data but also for the associated paraphrased versions provided by TOFU. |
| Researcher Affiliation | Academia | 1 TMLR Group, Department of Computer Science, Hong Kong Baptist University; 2 RIKEN Center for Advanced Intelligence Project; 3 Sydney AI Center, The University of Sydney; 4 The University of Tokyo |
| Pseudocode | Yes | Algorithm 1 Binary Search for MM Calibration |
| Open Source Code | Yes | The code is publicly available at: https://github.com/tmlr-group/Unlearning-with-Control. |
| Open Datasets | Yes | Our evaluations were based on the well-established benchmarks of TOFU fictitious unlearning (Maini et al., 2024), focusing on LLMs fine-tuned with a series of fictitious authors profiles. |
| Dataset Splits | Yes | For the unlearning setups, the original TOFU data were separated into targeted and non-targeted parts, of which the adopted proportions are 1:99 (1% unlearning), 5:95 (5% unlearning), and 10:90 (10% unlearning). |
| Hardware Specification | Yes | All our experiments were realized by Transformers 4.42.4 with CUDA 12.1, using a series of computation nodes equipped with NVIDIA-A100-80GB GPUs and Intel(R) Xeon(R) Gold 6248R CPU @ 3.00GHz Processors. |
| Software Dependencies | Yes | All our experiments were realized by Transformers 4.42.4 with CUDA 12.1, using a series of computation nodes equipped with NVIDIA-A100-80GB GPUs and Intel(R) Xeon(R) Gold 6248R CPU @ 3.00GHz Processors. |
| Experiment Setup | Yes | For all the considered methods, we adopt the following implementation setups: the AdamW optimizer (Loshchilov & Hutter, 2017), the initial learning rate 2e-5 for Phi-1.5 and 1e-5 for Llama-2-7B, the batch size 16 for both the targeted and non-targeted data, the epoch number 5, and the linear warm-up for the first epoch. For MM calibration, we set τ = 0.95 for Phi-1.5 and τ = 0.90 for Llama-2-7B. |
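The Dataset Splits row describes separating TOFU into targeted (forget) and non-targeted (retain) parts at 1:99, 5:95, and 10:90 ratios. TOFU ships its own predefined splits, so the sketch below is only an illustration of the splitting logic; the function name `split_forget_retain` and the seeded-shuffle approach are assumptions, not the paper's implementation.

```python
import random

def split_forget_retain(records, forget_frac, seed=0):
    """Split a dataset into a targeted (forget) part and a non-targeted
    (retain) part, e.g. forget_frac=0.01 for the 1% unlearning setup."""
    rng = random.Random(seed)
    idx = list(range(len(records)))
    rng.shuffle(idx)  # deterministic shuffle so the split is reproducible
    cut = int(len(records) * forget_frac)
    forget = [records[i] for i in idx[:cut]]
    retain = [records[i] for i in idx[cut:]]
    return forget, retain
```

With `forget_frac` set to 0.01, 0.05, or 0.10, this reproduces the 1%, 5%, and 10% unlearning proportions quoted above.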
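The Pseudocode row names "Algorithm 1 Binary Search for MM Calibration", and the Experiment Setup row gives its threshold τ (0.95 for Phi-1.5, 0.90 for Llama-2-7B). The paper's algorithm is not reproduced in the extracted evidence, so the following is a generic binary-search sketch under an assumed interface: a `utility` callable that is monotonically non-increasing in the calibration factor, with the search returning the largest factor that still retains at least τ of the base utility. All names here are hypothetical.

```python
def binary_search_calibration(utility, tau, base_utility,
                              lo=0.0, hi=1.0, tol=1e-3):
    """Find (to within tol) the largest factor alpha in [lo, hi] such that
    utility(alpha) >= tau * base_utility, assuming utility is
    monotonically non-increasing in alpha."""
    target = tau * base_utility
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if utility(mid) >= target:
            lo = mid  # mid still retains enough utility; search higher
        else:
            hi = mid  # utility fell below the threshold; search lower
    return lo
```

Because the invariant `utility(lo) >= target` is maintained throughout, the returned value always satisfies the τ constraint.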
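The Experiment Setup row specifies a linear warm-up over the first epoch from an initial learning rate of 2e-5 (Phi-1.5) or 1e-5 (Llama-2-7B). A minimal sketch of such a schedule, assuming the rate ramps linearly to `base_lr` over `warmup_steps` and holds flat afterwards (the quoted setup does not specify any post-warm-up decay):

```python
def warmup_lr(step, base_lr, warmup_steps):
    """Learning rate at a given step: linear ramp during warm-up,
    then constant at base_lr."""
    if step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps
    return base_lr
```

For example, with `base_lr=2e-5` and one epoch of 100 steps, the rate grows from 2e-7 at step 0 to 2e-5 at step 99 and stays there for the remaining four epochs.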