Provable unlearning in topic modeling and downstream tasks
Authors: Stanley Wei, Sadhika Malladi, Sanjeev Arora, Amartya Sanyal
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | In this paper, we provide the first theoretical guarantees for unlearning in the pre-training and fine-tuning paradigm by studying topic models, simple bag-of-words language models that can be adapted to solve downstream tasks like retrieval and classification. First, we design a provably effective unlearning algorithm for topic models that incurs a computational overhead independent of the size of the original dataset. Our analysis additionally quantifies the deletion capacity of the model, i.e., the number of examples that can be unlearned without incurring a significant cost in model performance. Finally, we formally extend our analyses to account for adaptation to a given downstream task. |
| Researcher Affiliation | Academia | Stanley Wei, Sadhika Malladi, Sanjeev Arora (Princeton University); Amartya Sanyal (University of Copenhagen) |
| Pseudocode | Yes | Algorithm 1: Unlearning algorithm (U_base); Algorithm 2: Unlearning algorithm for task T (U_head); Algorithm 3: High-level learning algorithm (A); Algorithm 4: Recover Anchors; Algorithm 5: Recover Topics |
| Open Source Code | No | The paper does not contain any explicit statements or links indicating that source code for the described methodology is publicly available or released in supplementary materials. |
| Open Datasets | No | The paper refers to a 'corpus of documents' and a 'dataset of m documents' generically. While it mentions the context of topic modeling and references Blei et al. (2003) for LDA, it does not provide specific access information (links, DOIs, repository names, or formal citations with authors and years for specific datasets) for any publicly available dataset used. |
| Dataset Splits | No | The paper is theoretical and focuses on algorithms and guarantees for machine unlearning. It discusses a 'training set S' and 'forget set Sf' but does not specify any training/test/validation dataset splits, their percentages, sample counts, or methodologies for empirical evaluation. |
| Hardware Specification | No | The paper is theoretical in nature, detailing algorithms and proofs for machine unlearning. It does not include an experimental section and therefore does not provide any specific hardware specifications such as GPU or CPU models, or details about the computing environment. |
| Software Dependencies | No | The paper describes theoretical algorithms and their guarantees. It does not mention any specific software dependencies, libraries, frameworks, or their version numbers that would be required to implement the described methods. |
| Experiment Setup | No | The paper focuses on theoretical analysis, presenting algorithms and proofs rather than empirical experiments. Consequently, it does not include details on experimental setup, hyperparameters, model initialization, or training configurations. |
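The paper itself releases no code, so nothing here can be reproduced directly. As a purely hypothetical sketch (not the paper's algorithm), the following toy example illustrates the kind of property the abstract claims: if a bag-of-words model is learned from aggregate sufficient statistics, deleting a document's contribution only requires touching that document, so the unlearning cost is independent of the original corpus size. The class name `BowTopicStats` and all methods below are invented for illustration.

```python
from collections import Counter

class BowTopicStats:
    """Toy aggregate statistics for a bag-of-words model (illustrative only).

    The model state is just word counts, so removing one document's
    contribution costs O(len(doc)), regardless of corpus size.
    """

    def __init__(self):
        self.word_counts = Counter()
        self.num_docs = 0

    def add_document(self, doc):
        # Fold a document (a list of word tokens) into the statistics.
        self.word_counts.update(doc)
        self.num_docs += 1

    def unlearn_document(self, doc):
        # Subtract exactly this document's contribution.
        self.word_counts.subtract(doc)
        self.num_docs -= 1

docs = [["apple", "banana"], ["apple", "cherry"], ["banana", "banana"]]
stats = BowTopicStats()
for d in docs:
    stats.add_document(d)

# "Forget" the second document.
stats.unlearn_document(docs[1])

# Sanity check: the statistics match retraining from scratch on the
# remaining documents (unary + drops zero-count entries in a Counter).
retrained = BowTopicStats()
for d in (docs[0], docs[2]):
    retrained.add_document(d)
assert +stats.word_counts == +retrained.word_counts
```

This exactness-by-subtraction property only holds because the toy state is linear in the data; the paper's actual guarantees concern topic recovery and downstream adaptation and are considerably more involved.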