Adaptive Localization of Knowledge Negation for Continual LLM Unlearning

Authors: Abudukelimu Wuerkaixi, Qizhou Wang, Sen Cui, Wutong Xu, Bo Han, Gang Niu, Masashi Sugiyama, Changshui Zhang

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type: Experimental. "Comprehensive experiments across three well-established LLM unlearning datasets demonstrate that our approach consistently outperforms baseline methods in both unlearning effectiveness and utility retention under continual unlearning settings." ... "We conducted comprehensive experiments to validate the effectiveness of our proposed method. Experiments are performed across multiple models and datasets, demonstrating that ALKN effectively retains the general utility of the model while achieving effective unlearning of sequential tasks."
Researcher Affiliation: Academia. 1) Institute for Artificial Intelligence, Tsinghua University (THUAI); Beijing National Research Center for Information Science and Technology (BNRist); Department of Automation, Tsinghua University, Beijing, P.R. China. 2) TMLR Group, Department of Computer Science, Hong Kong Baptist University. 3) RIKEN. 4) The University of Tokyo.
Pseudocode: Yes. "Algorithm 1: Continual unlearning method ALKN"
Open Source Code: Yes. "Code is available at: https://github.com/zaocan666/ALKN"
Open Datasets: Yes. "We conduct experiments on three datasets: TOFU, MUSE News, and WHP, each representing scenarios relevant to real-world applications such as privacy preservation and copyright protection. 1) TOFU (Maini et al., 2024) dataset consists of generated fictional author information. 2) MUSE News (Shi et al., 2024b) comprises BBC news articles... 3) WHP (Who's Harry Potter) (Eldan & Russinovich, 2023; Shi et al., 2024b) involves the unlearning of original text from the Harry Potter series."
Dataset Splits: Yes. "To simulate a continual unlearning setting, we divide the data designated for unlearning into multiple disjoint subsets based on the structure of the datasets. These subsets, either from different parts of the same dataset or a mix of datasets, are treated as sequential unlearning tasks. ... 1) TOFU (Maini et al., 2024) dataset consists of generated fictional author information. We group the information of four authors into one unlearning task, resulting in five or more continual unlearning tasks. ... 2) MUSE News (Shi et al., 2024b) comprises BBC news articles and includes four predefined unlearning subsets that serve as continual unlearning tasks. 3) WHP (Who's Harry Potter) (Eldan & Russinovich, 2023; Shi et al., 2024b) involves the unlearning of original text from the Harry Potter series. We treat the original text of three Harry Potter books as three separate unlearning tasks. ... To safeguard the general utility of the model throughout this iterative process, a retain set $D_r = \{x_r^i\}_{i \in [n_r]}$ can be employed to mitigate unintended degradation of utility."
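The task construction described above (grouping four TOFU authors per unlearning task) can be sketched as follows. This is an illustrative reconstruction, not the authors' code; the function name, data layout, and the per-author record counts are assumptions.

```python
# Hypothetical sketch: form disjoint sequential unlearning tasks from
# TOFU-style per-author records by grouping four authors per task.

def make_unlearning_tasks(records_by_author, authors_per_task=4):
    """Group author records into disjoint sequential unlearning tasks.

    records_by_author: dict mapping author name -> list of examples.
    Returns a list of tasks, each a flat list of that group's examples.
    """
    authors = list(records_by_author)
    tasks = []
    for start in range(0, len(authors), authors_per_task):
        group = authors[start:start + authors_per_task]
        tasks.append([ex for a in group for ex in records_by_author[a]])
    return tasks

# Toy example: 20 fictional authors with 2 QA records each
# yields 5 sequential tasks of 4 authors (8 examples) each.
data = {f"author_{i}": [f"qa_{i}_{j}" for j in range(2)] for i in range(20)}
tasks = make_unlearning_tasks(data)
print(len(tasks))     # 5
print(len(tasks[0]))  # 8
```

Because the groups are disjoint, each example appears in exactly one task, matching the "multiple disjoint subsets" requirement of the continual setting.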
Hardware Specification: No. The paper does not provide specific hardware details (e.g., GPU/CPU models, processor types, or memory amounts) used for running its experiments.
Software Dependencies: No. "The Adam optimizer is employed for the training of all methods." (Appendix E.1) No version numbers are provided for the Adam optimizer implementation or any other software dependencies.
Experiment Setup: Yes. "The Adam optimizer is employed for the training of all methods. We employ the learning rate of 3e-5 for the TOFU dataset and 3e-6 for the MUSE News and WHP datasets. ... The hyperparameters s and λ of our method are set to 10 and 0.8. µ is set as the deviation of the unlearning gradients Gu. mt is initialized with zero vectors at the beginning of each training. The weight decay ratio is set to 0.01 for all experiments. For the TOFU dataset, the batch size is set to 32 and models are trained for 5 epochs. For the MUSE News dataset, the batch size is 6 and the number of epochs is 2. For the WHP dataset, the batch size and epoch number are 6 and 10."
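The per-dataset hyperparameters quoted above can be collected into a single configuration table, which makes the reported setup easier to audit. The dict layout and key names below are illustrative choices, not the authors' code; the numeric values are those reported in the paper.

```python
# Per-dataset training settings as reported in the paper (Appendix E.1).
# Key names ("lr", "batch_size", ...) are illustrative, not the authors'.
DATASET_CONFIGS = {
    "TOFU":      {"lr": 3e-5, "batch_size": 32, "epochs": 5},
    "MUSE_News": {"lr": 3e-6, "batch_size": 6,  "epochs": 2},
    "WHP":       {"lr": 3e-6, "batch_size": 6,  "epochs": 10},
}

# Settings shared across all experiments: Adam optimizer, weight decay
# 0.01, and the method's hyperparameters s = 10 and lambda = 0.8.
COMMON = {"optimizer": "Adam", "weight_decay": 0.01, "s": 10, "lambda": 0.8}

def get_config(dataset):
    """Merge shared settings with the dataset-specific ones."""
    cfg = dict(COMMON)
    cfg.update(DATASET_CONFIGS[dataset])
    return cfg

print(get_config("TOFU"))
```

A reproduction attempt would still need to fix the items the report flags as missing, such as hardware and library versions.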