Erasing Concept Combination from Text-to-Image Diffusion Model

Authors: Hongyi Nie, Quanming Yao, Yang Liu, Zhen Wang, Yatao Bian

ICLR 2025

Reproducibility: Variable | Result | LLM Response
Research Type | Experimental | "Extensive experiments across diverse visual concept combination scenarios verify the effectiveness of CoGFD. As CCE is a newly defined task, we designed an evaluation framework incorporating six assessment metrics and conducted thorough experiments across three datasets from different scenarios."
Researcher Affiliation | Collaboration | Northwestern Polytechnical University; Tsinghua University; Tencent AI Lab
Pseudocode | Yes | "The details of the iterative graph generation strategy are illustrated in Algorithm 1. The overall algorithm of CoGFD for addressing the CCE task is illustrated in Algorithm 2."
Open Source Code | Yes | "The code is available at: https://github.com/Sirius11311/CoGFD-ICLR25."
Open Datasets | Yes | "Unlearn Canvas (Zhang et al., 2024b) is a state-of-the-art benchmark dataset in the field of concept erasing, providing 1,000 distinct visual concept combinations of 20 common objects and 50 different painting styles. COCO30K contains 30,000 images featuring combinations of common visual objects. Additionally, we created a new dataset named Harmful Cmb, which includes 10 inappropriate image themes, each corresponding to a set of 100 high-risk concept combinations constructed from several harmless concepts. Our code and dataset can be obtained from https://anonymous.4open.science/r/CoGFD-F788."
Dataset Splits | Yes | "For each theme, we selected one concept combination from the set to fine-tune the stable diffusion model, and the remaining 99 concept combinations were used for evaluation. Each theme contains 100 samples in total: one is used for fine-tuning and the remaining 99 for evaluation."
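This leave-one-out split per theme can be sketched in a few lines of Python. The helper name and the placeholder combination names below are hypothetical illustrations, not the actual dataset contents:

```python
# Leave-one-out split per theme: 1 combination fine-tunes the model,
# the other 99 are held out for evaluation.
def split_theme(combinations, finetune_index=0):
    """Return (finetune_sample, evaluation_samples) for one theme."""
    finetune = combinations[finetune_index]
    evaluation = combinations[:finetune_index] + combinations[finetune_index + 1:]
    return finetune, evaluation

# Hypothetical placeholder names for the 100 combinations of one theme.
theme = [f"harmless_concept_combo_{i:03d}" for i in range(100)]
ft, ev = split_theme(theme)
print(len(ev))  # 99 combinations held out for evaluation
```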
Hardware Specification | Yes | "In this work, all experiments are conducted on a machine with two NVIDIA A6000 GPUs, each with 48 GB of memory."
Software Dependencies | No | "For concept logic graph generation, we use AutoGen (Wu et al., 2023) to construct the interaction of two agents and use GPT-4 as the base model for each agent by calling the GPT-4 API. Since the code repository of (Zhang et al., 2024b) has uniformly organized and encapsulated the original code of each baseline method, we use its source code as our code base."
Experiment Setup | Yes | "For high-level feature decoupling, α is set to 0.1, and we fine-tune only the parameters in the cross-attention layers. For the Harmful Cmb and Unlearn Canvas datasets, the number of iterations K is 2 and 1, respectively. Algorithm 2 inputs: a textual concept combination ˆm, a text-to-image diffusion model ϕθ, number of epochs E, and number of samples N."
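The restriction of fine-tuning to cross-attention layers can be sketched by filtering parameter names. This is a minimal sketch assuming the diffusers naming convention, in which cross-attention modules of the Stable Diffusion UNet are named `attn2` (self-attention is `attn1`); the parameter names listed are illustrative placeholders, not the model's actual state dict:

```python
# Select only cross-attention parameters for fine-tuning; everything else
# stays frozen. Assumes the diffusers convention: cross-attention modules
# are named "attn2" (self-attention modules are "attn1").
def trainable_param_names(all_names):
    return [n for n in all_names if ".attn2." in n]

# Illustrative placeholder parameter names.
names = [
    "down_blocks.0.attentions.0.transformer_blocks.0.attn1.to_q.weight",  # self-attn: frozen
    "down_blocks.0.attentions.0.transformer_blocks.0.attn2.to_k.weight",  # cross-attn: tuned
    "down_blocks.0.attentions.0.transformer_blocks.0.attn2.to_v.weight",  # cross-attn: tuned
    "mid_block.resnets.0.conv1.weight",                                   # conv: frozen
]
print(trainable_param_names(names))
```

In a real PyTorch run the same filter would be applied to `model.named_parameters()` to build the optimizer's parameter list.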