Provable unlearning in topic modeling and downstream tasks

Authors: Stanley Wei, Sadhika Malladi, Sanjeev Arora, Amartya Sanyal

ICLR 2025

Reproducibility assessment (Variable / Result / LLM Response):

Research Type: Theoretical
    "In this paper, we provide the first theoretical guarantees for unlearning in the pre-training and fine-tuning paradigm by studying topic models, simple bag-of-words language models that can be adapted to solve downstream tasks like retrieval and classification. First, we design a provably effective unlearning algorithm for topic models that incurs a computational overhead independent of the size of the original dataset. Our analysis additionally quantifies the deletion capacity of the model, i.e., the number of examples that can be unlearned without incurring a significant cost in model performance. Finally, we formally extend our analyses to account for adaptation to a given downstream task."

Researcher Affiliation: Academia
    Stanley Wei, Sadhika Malladi, Sanjeev Arora (Princeton University); Amartya Sanyal (University of Copenhagen)

Pseudocode: Yes
    Algorithm 1: Unlearning algorithm (U_base); Algorithm 2: Unlearning algorithm for task T (U_head); Algorithm 3: High-level learning algorithm (A); Algorithm 4: Recover Anchors; Algorithm 5: Recover Topics

Open Source Code: No
    The paper contains no explicit statements or links indicating that source code for the described methodology is publicly available or released as supplementary material.

Open Datasets: No
    The paper refers generically to a "corpus of documents" and a "dataset of m documents". While it discusses topic modeling and cites Blei et al. (2003) for LDA, it provides no access information (links, DOIs, repository names, or formal dataset citations) for any publicly available dataset.

Dataset Splits: No
    The paper is theoretical, focusing on algorithms and guarantees for machine unlearning. It discusses a training set S and a forget set S_f but specifies no training/validation/test splits, percentages, sample counts, or methodology for empirical evaluation.

Hardware Specification: No
    The paper is theoretical in nature, detailing algorithms and proofs for machine unlearning. It contains no experimental section and therefore gives no hardware specifications such as GPU or CPU models or details of the computing environment.

Software Dependencies: No
    The paper describes theoretical algorithms and their guarantees. It mentions no software dependencies, libraries, frameworks, or version numbers required to implement the described methods.

Experiment Setup: No
    The paper presents theoretical analysis, algorithms, and proofs rather than empirical experiments. Consequently, it includes no details on experimental setup, hyperparameters, model initialization, or training configuration.
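To make the abstract's claim of "computational overhead independent of the size of the original dataset" concrete, here is a minimal toy sketch — not the paper's U_base algorithm, but an assumed illustration of the same principle: if a topic-model learner fits topics from aggregate word co-occurrence statistics (as anchor-word methods do), then deleting a document reduces to subtracting that document's contribution from the aggregate, at a cost depending only on the deleted document.

```python
import numpy as np

VOCAB = 5  # toy vocabulary size

def doc_cooccurrence(doc):
    """Word co-occurrence counts for one bag-of-words document."""
    counts = np.bincount(doc, minlength=VOCAB).astype(float)
    return np.outer(counts, counts) - np.diag(counts)  # exclude self-pairs

# A tiny corpus of three bag-of-words documents (word indices).
corpus = [np.array([0, 1, 1, 2]), np.array([2, 3, 4]), np.array([0, 0, 4])]

# "Training": aggregate sufficient statistics over the whole corpus.
Q = sum(doc_cooccurrence(d) for d in corpus)

# "Unlearning" document 1: subtract only its statistics -- O(|doc|^2) work,
# independent of how many documents the corpus contains.
Q_unlearned = Q - doc_cooccurrence(corpus[1])

# Retraining from scratch on the remaining documents yields identical statistics.
Q_retrained = sum(doc_cooccurrence(d) for i, d in enumerate(corpus) if i != 1)
assert np.array_equal(Q_unlearned, Q_retrained)
```

Any topic estimator that is a deterministic function of these aggregate statistics then produces, after the subtraction, exactly the model it would have produced had the deleted document never been seen.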