System-Aware Unlearning Algorithms: Use Lesser, Forget Faster

Authors: Linda Lu, Ayush Sekhari, Karthik Sridharan

ICML 2025

Reproducibility Assessment — for each variable, the result followed by the LLM's response:
Research Type: Experimental
LLM Response: "B. Experimental Evaluation: Our theoretical results provide guarantees for worst-case deletions. We experimentally verify our theory and demonstrate that, in practice, Algorithm 1 can maintain small excess error beyond the core-set deletion capacities proven in Theorem 4.1. Furthermore, Algorithm 1 is significantly more memory- and computation-time-efficient than other unlearning methods for linear classification."
Researcher Affiliation: Academia
LLM Response: "¹Cornell University, ²Boston University. Correspondence to: Linda Lu <EMAIL>."
Pseudocode: Yes
LLM Response: "Algorithm 1: System-Aware Unlearning Algorithm for Linear Classification using Selective Sampling; Algorithm 2: System-Aware Unlearning Algorithm for General Classification using Selective Sampling."
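The core idea behind selective-sampling-based unlearning is that the learner stores (and trains on) only the points whose labels it chose to query, so deleting any unqueried point requires no update at all. The sketch below illustrates that idea with a generic margin-based selective sampler over regularized least-squares sufficient statistics; the query rule, class name, and thresholds are illustrative assumptions, not the paper's exact Algorithm 1.

```python
import numpy as np

class SelectiveSampler:
    """Margin-based selective sampler with cheap deletion of stored points.

    Illustrative sketch only: query rule and statistics are a generic
    regularized-least-squares sampler, not the paper's Algorithm 1.
    """

    def __init__(self, d, reg=1.0, threshold=0.1):
        self.threshold = threshold
        self.A = reg * np.eye(d)   # regularized second-moment of queried points
        self.b = np.zeros(d)       # label-weighted sum of queried points
        self.queried = {}          # core set: index -> (x, y)

    def observe(self, i, x, y):
        # Query the label only when the current predictor is uncertain
        # (small margin); otherwise the point is never stored.
        w = np.linalg.solve(self.A, self.b)
        if abs(w @ x) <= self.threshold:
            self.queried[i] = (x, y)
            self.A += np.outer(x, x)
            self.b += y * x

    def unlearn(self, i):
        # Points outside the core set were never stored: deletion is free.
        if i not in self.queried:
            return
        x, y = self.queried.pop(i)
        self.A -= np.outer(x, x)   # downdate sufficient statistics
        self.b -= y * x
```

Because the predictor depends only on `A` and `b`, removing a queried point is a rank-one downdate rather than a full retrain, and removing an unqueried point is a no-op.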
Open Source Code: No
LLM Response: "The paper does not provide an explicit statement about open-sourcing code or a link to a code repository."
Open Datasets: Yes
LLM Response: "Purchase Dataset: a binary classification dataset on item-purchase data curated by Bourtoule et al. (2021), with 249,215 points in dimension d = 600. Margin Dataset: a synthetic binary classification dataset with 200,000 points in dimension d = 100 and a hard-margin condition of γ = 0.1 (u·x > 0.1 for all x, for some underlying u ∈ ℝ^d)."
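A synthetic hard-margin dataset of this shape can be generated by rejection sampling: draw unit-norm points and keep only those whose projection onto an underlying direction u exceeds the margin γ. The exact construction used in the paper is not specified, so the function below is a plausible sketch under those assumptions.

```python
import numpy as np

def make_margin_dataset(n=200_000, d=100, gamma=0.1, seed=0):
    """Generate n unit-norm points in R^d satisfying |u . x| > gamma.

    Illustrative reconstruction only; the paper does not specify its
    sampling procedure for the Margin Dataset.
    """
    rng = np.random.default_rng(seed)
    u = rng.standard_normal(d)
    u /= np.linalg.norm(u)                     # unit-norm underlying direction

    batches, count = [], 0
    while count < n:
        cand = rng.standard_normal((2 * n, d))
        cand /= np.linalg.norm(cand, axis=1, keepdims=True)  # unit-norm points
        cand = cand[np.abs(cand @ u) > gamma]  # reject points inside the margin
        batches.append(cand)
        count += cand.shape[0]

    X = np.vstack(batches)[:n]
    y = np.sign(X @ u)                         # labels from the direction u
    return X, y, u
```

For d = 100, the projection u·x of a random unit vector has standard deviation roughly 1/√d = 0.1, so a substantial fraction of candidates survives the γ = 0.1 margin filter and the loop terminates quickly.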
Dataset Splits: No
LLM Response: "The paper mentions comparing 'test accuracy' but does not provide specific details on how the datasets were split into training, validation, or test sets."
Hardware Specification: No
LLM Response: "The paper does not provide specific details about the hardware used for experiments, such as GPU models, CPU types, or memory specifications."
Software Dependencies: No
LLM Response: "The paper does not provide specific details about software dependencies with version numbers used in the experiments."
Experiment Setup: No
LLM Response: "The experimental evaluation section describes the unlearning procedures and datasets used, but it does not provide specific hyperparameter values (e.g., learning rate, batch size, or a concrete value for the sampling parameter κ) or detailed system-level training settings required for reproduction."