Data Unlearning in Diffusion Models

Authors: Silas Alberti, Kenan Hasanaliyev, Manav Shah, Stefano Ermon

ICLR 2025

Reproducibility Variable Result LLM Response
Research Type: Experimental. "When evaluated on CelebA-HQ and MNIST, SISS achieved Pareto optimality along the quality and unlearning strength dimensions. On Stable Diffusion, SISS successfully mitigated memorization on nearly 90% of the prompts we tested." "We evaluate our SISS method, its ablations, EraseDiff, NegGrad, and naive deletion through unlearning experiments on CelebA-HQ, MNIST T-Shirt, and Stable Diffusion."
Researcher Affiliation: Academia. Silas Alberti, Kenan Hasanaliyev, Manav Shah, Stefano Ermon (Stanford University).
Pseudocode: No. The paper defines loss functions and describes the method's mathematical properties, but it does not include a distinct pseudocode block or algorithm section.
Open Source Code: Yes. "We release our code online." Code: https://github.com/claserken/SISS
Open Datasets: Yes. "We demonstrate the effectiveness of SISS on CelebA-HQ (Karras et al., 2018), MNIST with T-Shirt, and Stable Diffusion. The base model for MNIST with T-Shirt was trained on MNIST (Deng, 2012) augmented with a specific T-shirt from Fashion-MNIST (Xiao et al., 2017)." Memorized prompts for Stable Diffusion v1.4 (trained on LAION) were drawn from Webster (2023).
Dataset Splits: No. The paper describes how datasets were augmented or synthetically generated for specific experiments (e.g., "sampling 128 images for each prompt and using a k-means classifier for labelling each image as memorized (A) or not (X \ A)") and how fine-tuning was performed. However, it does not provide explicit training/validation/test splits for the unlearning experiments themselves.
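The k-means labelling step quoted above can be sketched as two-cluster k-means over a scalar per-image score (e.g., each sampled image's distance to the memorized target image). The score values and the "lower score = memorized" convention are illustrative assumptions here, not the paper's exact procedure:

```python
def two_means_labels(scores, iters=50):
    """Cluster scalar per-image scores into two groups with k-means (k=2).

    Returns a list of booleans: True for images in the low-score cluster,
    assuming a lower score means closer to the memorized target image.
    """
    # Initialize the two centroids at the extremes of the score range.
    c0, c1 = min(scores), max(scores)
    for _ in range(iters):
        groups = ([], [])
        for s in scores:
            # Assign each score to its nearest centroid.
            groups[0 if abs(s - c0) <= abs(s - c1) else 1].append(s)
        # Recompute centroids as cluster means.
        new0 = sum(groups[0]) / len(groups[0]) if groups[0] else c0
        new1 = sum(groups[1]) / len(groups[1]) if groups[1] else c1
        if (new0, new1) == (c0, c1):  # converged
            break
        c0, c1 = new0, new1
    return [abs(s - c0) <= abs(s - c1) for s in scores]

# Hypothetical distances for 7 sampled images; low = near the memorized image.
labels = two_means_labels([0.1, 0.12, 0.09, 0.95, 0.9, 1.0, 0.11])
# → [True, True, True, False, False, False, True]
```

In the paper's setup this would be run per prompt over 128 sampled images, with the low-score cluster labelled memorized (A) and the rest X \ A.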
Hardware Specification: Yes. "A cluster of 8 NVIDIA H100 GPUs was used to execute large numbers of runs in parallel. In addition, a g5.xlarge instance with an NVIDIA A10G GPU on AWS, a personal home computer with an NVIDIA RTX 3090, and a cluster of 3 NVIDIA A4000 GPUs were the primary code development environments."
Software Dependencies: No. "All diffusion models were trained and fine-tuned using the Hugging Face diffusers package along with the Adam optimizer (Kingma & Ba, 2015)." The paper names these packages but does not specify their version numbers.
Experiment Setup: Yes. "Our pretrain and retrain unconditional MNIST T-Shirt DDPMs were trained for 250 epochs with a batch size of 128 images and a learning rate of 1e-4 with cosine decay. Both models used the same DDPM sampler at inference with 50 backwards steps. In the case of CelebA-HQ and Stable Diffusion, we did not perform the pretraining and chose batch sizes of 64 and 16 images with learning rates of 5e-6 and 1e-5, respectively."
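The reported "learning rate of 1e-4 with cosine decay" corresponds to the standard cosine-annealing schedule; the sketch below illustrates that formula only, and is not the authors' training code (which uses Hugging Face diffusers). The step-count arithmetic (60,000 MNIST images, batch size 128) is an assumed example:

```python
import math

def cosine_decay_lr(step, total_steps, base_lr=1e-4):
    """Cosine-annealed learning rate: base_lr at step 0, decaying to 0."""
    progress = min(step / total_steps, 1.0)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

# Assumed example: 250 epochs over ~60k MNIST images at batch size 128.
total_steps = 250 * (60_000 // 128)
lr_start = cosine_decay_lr(0, total_steps)            # 1e-4
lr_mid = cosine_decay_lr(total_steps // 2, total_steps)  # ~5e-5
lr_end = cosine_decay_lr(total_steps, total_steps)    # 0.0
```

In practice this schedule is typically obtained from a library helper (e.g., a cosine scheduler in diffusers or PyTorch) rather than written by hand.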