Textual Unlearning Gives a False Sense of Unlearning
Authors: Jiacheng Du, Zhibo Wang, Jie Zhang, Xiaoyi Pang, Jiahui Hu, Kui Ren
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through extensive evaluations, our findings critically reveal that textual unlearning actually gives a false sense of unlearning! Existing methods for textual unlearning fail to completely erase unlearned texts, and their deployment will instead introduce heightened privacy risks. Specifically, we highlight the following key findings: Unlearned texts remain detectable with high confidence on the unlearned LMs. ... We present the results on Pythia-1.4b and the SynthPAI-age dataset in Table 1, and results in other settings are provided in Appendix E. |
| Researcher Affiliation | Academia | (1) The State Key Laboratory of Blockchain and Data Security, Zhejiang University, P. R. China; (2) School of Cyber Science and Technology, Zhejiang University, P. R. China; (3) ETH Zurich, Switzerland. Correspondence to: Zhibo Wang <EMAIL>. |
| Pseudocode | Yes | The detailed process of U-LiRA+ is outlined in Algorithm 1. To rigorously audit the unlearning algorithm U, we first sample an audit dataset D_audit from the training data distribution and mislabel it. ... Algorithm 2 TULA-MI in the Strict Case ... Algorithm 3 TULA-DR ... Algorithm 4 TULA-MI in the Relaxed Case |
| Open Source Code | No | The text does not contain an explicit statement about releasing the source code or a link to a repository for the methodology described in this paper. |
| Open Datasets | Yes | To ensure rigorous evaluation, we construct two synthetic datasets derived from the SynthPAI dataset (Yukhymenko et al., 2024), enabling fine-tuning on previously unseen data. |
| Dataset Splits | No | The paper mentions 'Train ACC and Test ACC' in Table 4, indicating that train/test splits were used, but it does not specify the exact percentages, sample counts, or splitting methodology for the main datasets (SynthPAI-age and SynthPAI-inc) used to fine-tune the LMs. It only details how audited samples are divided into unlearned and unseen parts. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware (e.g., GPU models, CPU models, memory specifications) used to run the experiments. |
| Software Dependencies | No | The paper mentions software components such as the 'AdamW optimizer' and the 'LightGBM classifier' but does not provide specific version numbers for these or for other key software dependencies (e.g., Python, PyTorch, TensorFlow, CUDA) required for replication. |
| Experiment Setup | Yes | We utilize the AdamW optimizer (Loshchilov, 2017) to full-parameter fine-tune the model to obtain M_original, with the learning rate set to 1e-5, the batch size to 64, and the number of training rounds to 2. ... For TULA-DR, the learning rate α is set to 0.1 for the Pythia-1.4b model and 0.45 for the OPT-1.3b model, and the regularization factor β is fixed at 0.1. |
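The TULA-DR hyperparameters quoted in the setup row (step size α and regularization factor β) play the standard roles in a regularized gradient-descent optimization. The toy below is a minimal sketch of that generic update rule only, assuming a stand-in quadratic objective chosen for illustration; it is not the paper's actual reconstruction loss or TULA-DR algorithm.

```python
# Illustrative regularized gradient descent with the reported TULA-DR
# hyperparameters alpha = 0.1 (step size) and beta = 0.1 (regularization
# factor). The objective f(x) = (x - 3)^2 + beta * x^2 is a hypothetical
# stand-in, NOT the paper's reconstruction objective.

def grad(x: float, beta: float = 0.1) -> float:
    # d/dx [(x - 3)^2 + beta * x^2] = 2*(x - 3) + 2*beta*x
    return 2.0 * (x - 3.0) + 2.0 * beta * x

def reconstruct(x0: float = 0.0, alpha: float = 0.1,
                beta: float = 0.1, steps: int = 200) -> float:
    x = x0
    for _ in range(steps):
        x -= alpha * grad(x, beta)  # update: x <- x - alpha * grad f(x)
    return x

print(round(reconstruct(), 4))  # → 2.7273, the minimizer 3 / (1 + beta)
```

Larger β pulls the solution toward zero (stronger regularization), while α controls convergence speed; with α = 0.1 the iterate contracts by a factor of 0.78 per step, so 200 steps reach the fixed point to machine precision.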