DiffusionGuard: A Robust Defense Against Malicious Diffusion-based Image Editing

Authors: June Suk Choi, Kyungmin Lee, Jongheon Jeong, Saining Xie, Jinwoo Shin, Kimin Lee

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Through extensive experiments, we show that our method achieves stronger protection and improved mask robustness with lower computational costs compared to the strongest baseline. Additionally, our method exhibits superior transferability and better resilience to noise-removal techniques compared to all baseline methods. For concrete evaluation, we introduce InpaintGuardBench, a challenging benchmark designed to assess defense methods against image editing models. We conduct human surveys and measure qualitative metrics to evaluate DiffusionGuard. Through extensive experiments, we demonstrate both qualitatively and quantitatively that DiffusionGuard is effective and, most importantly, robust against changes in mask inputs.
Researcher Affiliation | Academia | June Suk Choi (KAIST), Kyungmin Lee (KAIST), Jongheon Jeong (Korea University), Saining Xie (NYU), Jinwoo Shin (KAIST), Kimin Lee (KAIST)
Pseudocode | Yes | The full procedure of mask augmentation is summarized in Algorithm 1.
Open Source Code | Yes | Our source code is publicly available at our project page: https://choi403.github.io/diffusionguard.
Open Datasets | No | Our benchmark, named InpaintGuardBench, consists of 42 images, each associated with five unique masks. Out of these, one mask per image is generated using SAM (Kirillov et al., 2023), a state-of-the-art segmentation method, and the remaining four masks are handcrafted using the most common tools employed by end-users. The 32 celebrity images were collected from the web and comprise 20 front-view and 12 side-view images spanning diverse races and domains. 10 non-human images were sourced from the DreamBooth (Ruiz et al., 2023) dataset. We visualize all images that we have used in Fig. 8 and all masks that we have used in Fig. 9. While the paper describes the creation and composition of InpaintGuardBench and references some source datasets such as DreamBooth, it provides no direct link, DOI, or explicit statement of public availability for the InpaintGuardBench dataset itself, nor does it state that the benchmark is included in supplementary materials for download.
Dataset Splits | Yes | Our benchmark contains 42 images, divided into three categories: 32 celebrity portraits, 5 inanimate objects, and 5 animals. We consider 10 edit prompts for each image, resulting in a total of 2,100 edit tasks (42 images × 5 masks × 10 prompts). For generating adversarial noises, we use the SAM-generated mask as the training ("seen") mask. We then evaluate the effectiveness of the generated adversarial perturbations on all 5 masks, including the 4 handcrafted "unseen" masks.
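The 2,100-task count follows directly from the Cartesian product of images, masks, and prompts; a quick sanity check of the seen/unseen split described above (all identifiers are placeholders, not the paper's actual file names):

```python
from itertools import product

# 32 celebrity portraits + 5 inanimate objects + 5 animals = 42 images
images = [f"img_{i:02d}" for i in range(42)]
# per image: 1 SAM-generated "seen" mask + 4 handcrafted "unseen" masks
masks = ["sam"] + [f"hand_{j}" for j in range(4)]
prompts = [f"prompt_{k}" for k in range(10)]

tasks = list(product(images, masks, prompts))          # all edit tasks
unseen = [t for t in tasks if t[1] != "sam"]           # evaluation-only tasks
print(len(tasks), len(unseen))
```

Only the SAM mask is used when optimizing the perturbation; the other four masks appear solely at evaluation time, which is how the benchmark probes mask robustness.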
Hardware Specification | Yes | We conduct all our experiments on a single NVIDIA H100 80GB HBM3 GPU.
Software Dependencies | No | The paper mentions using the DDIM (Song et al., 2021) sampler and DPM-Solver (Lu et al., 2022), but does not specify version numbers for general software dependencies such as Python, PyTorch, or CUDA.
Experiment Setup | Yes | For fair comparison, we match the time taken for running PGD optimization to 90 seconds in all comparisons throughout the paper. Additionally, we fix the random seed for a reliable comparison of the edited results of different methods, and we follow the projected gradient descent (PGD) (Madry et al., 2018) optimization configuration proposed by each method. For the generation of adversarial perturbation, we fix the input text prompt to an empty string ("") to maximize generalization to any test-time prompt. After protection, each image is edited using the DDIM (Song et al., 2021) sampler with 50 inference steps, following the default implementation of Stable Diffusion Inpainting (Rombach et al., 2022).
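The PGD loop used to craft the protective perturbation is not shown in this excerpt. For reference, the standard L-infinity PGD update of Madry et al. (2018) looks like the NumPy sketch below; the quadratic toy loss, step size, and epsilon budget are assumptions for illustration, whereas the actual method ascends a diffusion-model objective under a 90-second wall-clock budget.

```python
import numpy as np

def pgd_attack(x, grad_fn, eps=8 / 255, alpha=2 / 255, steps=40):
    """L-inf PGD: signed-gradient ascent, projected back into the eps-ball around x."""
    delta = np.zeros_like(x)
    for _ in range(steps):
        g = grad_fn(x + delta)                     # gradient of the loss at perturbed input
        delta = delta + alpha * np.sign(g)         # ascent step
        delta = np.clip(delta, -eps, eps)          # project onto the L-inf ball
        delta = np.clip(x + delta, 0.0, 1.0) - x   # keep pixels in the valid [0, 1] range
    return x + delta

# toy example: maximize 0.5 * ||z||^2, whose gradient is simply z
x0 = np.full(4, 0.5)
x_adv = pgd_attack(x0, lambda z: z)
```

Because the gradient stays positive here, the perturbation saturates at the eps-ball boundary, which is the projection behavior the `np.clip` calls enforce.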