Concept Siever : Towards Controllable Erasure of Concepts from Diffusion Models without Side-effect

Authors: Aakash Kumar Singh, Priyam Dey, Sribhav Srivatsa, Venkatesh Babu Radhakrishnan

TMLR 2025

Reproducibility Variable Result LLM Response
Research Type Experimental We report state-of-the-art performance on the I2P benchmark, surpassing previous domain-agnostic methods by over 33% while showing superior structure preservation. We validate our results through extensive quantitative and qualitative evaluation along with a user study.
Researcher Affiliation Academia Vision and AI Lab, Department of Computational and Data Sciences, IISc Bangalore
Pseudocode No The paper describes its methodology using textual explanations and mathematical equations, but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code No The paper does not contain an explicit statement about releasing its own source code, nor does it provide a link to a code repository for the described methodology. A footnote mentions 'The implementation is adapted from this repository' which refers to a third-party resource, not their own implementation for Concept Siever.
Open Datasets Yes We evaluate Concept Siever on three datasets: I2P Benchmark which consists of NSFW content (Schramowski et al., 2023), Celebrity Identity (Heng & Soh, 2024) and Artistic Style (Gandikota et al., 2023). We also evaluate on MSCOCO-30K (Lin et al., 2014).
Dataset Splits No The paper mentions evaluating on '2500 images related to 100 artists, and efficacy by evaluating on 250 images on the forgotten concept' for artistic style, and states it 'follows MACE's evaluation protocol for benchmarking on I2P dataset', but does not provide explicit training, validation, or test dataset splits needed for reproduction.
Hardware Specification No The paper states, 'we conduct all our experiments using Stable Diffusion v1.4 (Rombach et al., 2022) as the base diffusion model. We run the DDIM (Song et al., 2020) sampler for 50 time steps.' However, it does not specify any hardware details such as GPU models, CPU types, or memory.
Software Dependencies No The paper mentions several software components and techniques like 'Stable Diffusion v1.4', 'DDIM sampler', 'DoRA (Liu et al., 2024)', 'LoRA (Hu et al., 2021)', and 'CLIP (Radford et al., 2021)', but it does not specify exact version numbers for any of these software libraries or frameworks.
Experiment Setup Yes We run the DDIM (Song et al., 2020) sampler for 50 time steps. The intensity of this shift can be continuously modulated by λ: by adjusting λ at inference time, a user can smoothly transition from the original model behavior (λ = 0) to complete concept erasure. We demonstrate this capability qualitatively for NSFW content in Figure 1 (right panel). We define the Concept Sieve τ as follows: τ = {w_i^c − w_i^cn}_{i=1}^l (Eq. 5). The objective function L for this fine-tuning stage is the MSE loss between z_t and z′_t... we progressively increase the percentage of updated columns from 10% to 70%. The noise variance can be varied further to obtain varying degrees of separation between the concept and concept-negated datasets. We test different levels of noise variance and the layer number of CLIP from which the embedding is extracted in Figure 10 of the supplementary.
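The quoted setup describes a Concept Sieve τ built from per-layer weight differences (Eq. 5) and a λ knob that interpolates between the original model and full erasure. A minimal NumPy sketch of how such a λ-modulated shift could work, assuming the sieve directions are subtracted linearly from the concept-adapted weights (the layer choice, sign convention, and names w_c / w_cn are our illustrative assumptions, not the paper's released code):

```python
import numpy as np

rng = np.random.default_rng(0)
n_layers, dim = 4, 8

# Hypothetical per-layer weight matrices from two fine-tuning runs:
# w_c  - weights adapted on the target-concept dataset
# w_cn - weights adapted on the concept-negated dataset
w_c = [rng.normal(size=(dim, dim)) for _ in range(n_layers)]
w_cn = [rng.normal(size=(dim, dim)) for _ in range(n_layers)]

# Concept Sieve (Eq. 5): per-layer differences tau_i = w_i^c - w_i^cn
tau = [wc - wcn for wc, wcn in zip(w_c, w_cn)]


def shifted_weights(weights, sieve, lam):
    """Shift each layer along its sieve direction, scaled by lam.

    lam = 0 leaves the weights unchanged (original behavior);
    lam = 1 moves them fully onto the concept-negated weights.
    """
    return [w - lam * t for w, t in zip(weights, sieve)]


# lam = 0: original model behavior is preserved exactly.
assert all(np.allclose(a, b) for a, b in zip(shifted_weights(w_c, tau, 0.0), w_c))
# lam = 1: w_c - (w_c - w_cn) = w_cn, i.e. full erasure in this linear model.
assert all(np.allclose(a, b) for a, b in zip(shifted_weights(w_c, tau, 1.0), w_cn))
```

Under this linear reading, intermediate λ values give a continuous blend between the two weight sets, which matches the report's description of smoothly transitioning from λ = 0 (original) toward complete erasure at inference time.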