Data Unlearning in Diffusion Models
Authors: Silas Alberti, Kenan Hasanaliyev, Manav Shah, Stefano Ermon
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | When evaluated on CelebA-HQ and MNIST, SISS achieved Pareto optimality along the quality and unlearning strength dimensions. On Stable Diffusion, SISS successfully mitigated memorization on nearly 90% of the prompts we tested. We evaluate our SISS method, its ablations, EraseDiff, NegGrad, and naive deletion through unlearning experiments on CelebA-HQ, MNIST T-Shirt, and Stable Diffusion. |
| Researcher Affiliation | Academia | Silas Alberti, Kenan Hasanaliyev, Manav Shah, Stefano Ermon (Stanford University) |
| Pseudocode | No | The paper defines loss functions and describes the method's mathematical properties, but it does not include a distinct pseudocode block or algorithm section. |
| Open Source Code | Yes | We release our code online: https://github.com/claserken/SISS |
| Open Datasets | Yes | We demonstrate the effectiveness of SISS on CelebA-HQ (Karras et al., 2018), MNIST with T-Shirt, and Stable Diffusion. The base model for MNIST with T-Shirt was trained on MNIST (Deng, 2012) augmented with a specific T-shirt from Fashion-MNIST (Xiao et al., 2017). For Stable Diffusion v1.4 (trained on LAION), memorized prompts were drawn from Webster (2023). |
| Dataset Splits | No | The paper describes how datasets were augmented or synthetically generated for specific experiments (e.g., "sampling 128 images for each prompt and using a k-means classifier for labelling each image as memorized (A) or not (X \ A)"), and how fine-tuning was performed. However, it does not provide explicit training/test/validation splits for the unlearning experiments themselves. |
| Hardware Specification | Yes | A cluster of 8 NVIDIA H100 GPUs was used to execute large numbers of runs in parallel. In addition, a g5.xlarge instance with an NVIDIA A10G GPU on AWS, a personal home computer with an NVIDIA RTX 3090, and a cluster of 3 NVIDIA A4000 GPUs were the primary code development environments. |
| Software Dependencies | No | All diffusion models were trained and fine-tuned using the Hugging Face diffusers package along with the Adam optimizer (Kingma & Ba, 2015). The paper mentions software packages like 'Hugging Face diffusers' and 'Adam optimizer' but does not specify their version numbers. |
| Experiment Setup | Yes | Our pretrain and retrain unconditional MNIST T-Shirt DDPMs were trained for 250 epochs with a batch size of 128 images and a learning rate of 1e-4 with cosine decay. Both models used the same DDPM sampler at inference with 50 backward steps. In the case of CelebA-HQ and Stable Diffusion, we did not perform the pretraining and chose a batch size of 64 and 16 images with a learning rate of 5e-6 and 1e-5, respectively. |
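The hyperparameters quoted above can be collected into a minimal sketch. This is not the authors' released code; the config-dict layout and the `cosine_decay_lr` helper are assumptions, while the numeric values come directly from the report.

```python
# Hedged sketch of the reported fine-tuning configurations (assumed structure,
# reported values). The authors' actual code is at github.com/claserken/SISS.
import math

CONFIGS = {
    # MNIST T-Shirt DDPM: 250 epochs, batch 128, lr 1e-4 with cosine decay,
    # 50 backward sampling steps at inference.
    "mnist_tshirt": {"epochs": 250, "batch_size": 128, "lr": 1e-4,
                     "cosine_decay": True, "sampling_steps": 50},
    # CelebA-HQ: no pretraining reported; batch 64, lr 5e-6.
    "celeba_hq": {"batch_size": 64, "lr": 5e-6},
    # Stable Diffusion v1.4: batch 16, lr 1e-5.
    "stable_diffusion": {"batch_size": 16, "lr": 1e-5},
}

def cosine_decay_lr(base_lr: float, step: int, total_steps: int) -> float:
    """Standard cosine decay from base_lr down to 0 over total_steps."""
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * step / total_steps))
```

With the Adam optimizer from the report, `cosine_decay_lr` would be applied per step (e.g. via a PyTorch `LambdaLR` scheduler) so the rate starts at `base_lr` and reaches 0 at the final step.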