Erasing Concept Combination from Text-to-Image Diffusion Model
Authors: hongyi nie, Quanming Yao, Yang Liu, Zhen Wang, Yatao Bian
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments across diverse visual concept combination scenarios verify the effectiveness of COGFD. As CCE is a newly defined task, we designed an evaluation framework incorporating six assessment metrics and conducted thorough experiments across three datasets from different scenarios. |
| Researcher Affiliation | Collaboration | 1Northwestern Polytechnical University 2Tsinghua University 3 Tencent AI Lab |
| Pseudocode | Yes | The details of the iterative graph generation strategy are illustrated in Algorithm 1. The overall algorithm of COGFD to address the CCE task is illustrated in Algorithm 2. |
| Open Source Code | Yes | 2The code is available at: https://github.com/Sirius11311/CoGFD-ICLR25. |
| Open Datasets | Yes | Unlearn Canvas (Zhang et al., 2024b) is a state-of-the-art benchmark dataset in the field of Concept Erasing, providing 1,000 distinct visual concept combinations of 20 common objects and 50 different painting styles. COCO30K contains 30,000 images featuring combinations of common visual objects. Additionally, we created a new dataset named Harmful Cmb, which includes 10 inappropriate image themes, with each theme corresponding to a set of 100 high-risk concept combinations constructed from several harmless concepts. Our code and dataset can be obtained from https://anonymous.4open.science/r/CoGFD-F788. |
| Dataset Splits | Yes | Each theme contains 100 samples. For each theme, we selected one concept combination from the set to fine-tune the stable diffusion model, and the remaining 99 concept combinations were used for evaluation. |
| Hardware Specification | Yes | In this work, all experiments are conducted on a machine with two NVIDIA A6000 GPUs, each with 48 GB of memory. |
| Software Dependencies | No | For concept logic graph generation, we use AutoGen (Wu et al., 2023) to construct the interaction of two agents, with GPT-4 as the base model for each agent, called via the GPT-4 API. Since the code repository of (Zhang et al., 2024b) uniformly organizes and encapsulates the original code of each baseline method, we use it as our code base. |
| Experiment Setup | Yes | For high-level feature decoupling, α is set to 0.1, and we fine-tune only the parameters in the cross-attention layers. For the Harmful Cmb and Unlearn Canvas datasets, the number of iterations K is 2 and 1, respectively. Algorithm 2 input: a textual concept combination m̂, a text-to-image diffusion model ϕθ, epoch number E, sample times N. |
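The per-theme split described above (one combination for fine-tuning, the remaining 99 for evaluation) can be sketched as follows; this is a minimal illustration, and the `split_theme` helper and the placeholder combination names are hypothetical rather than taken from the released code:

```python
import random

def split_theme(combinations, seed=0):
    """Split one theme's 100 concept combinations into a single
    fine-tuning sample and a 99-sample evaluation set."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    combos = list(combinations)
    finetune_sample = rng.choice(combos)
    eval_samples = [c for c in combos if c != finetune_sample]
    return finetune_sample, eval_samples

# Hypothetical theme: 100 placeholder concept combinations.
theme = [f"combo_{i}" for i in range(100)]
finetune_sample, eval_samples = split_theme(theme)
assert len(eval_samples) == 99
assert finetune_sample not in eval_samples
```

Applying such a split independently per theme keeps the evaluation combinations unseen during fine-tuning, which matches the paper's stated protocol.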