AdvI2I: Adversarial Image Attack on Image-to-Image Diffusion Models

Authors: Yaopei Zeng, Yuanpu Cao, Bochuan Cao, Yurui Chang, Jinghui Chen, Lu Lin

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Through extensive experiments, we demonstrate that both AdvI2I and AdvI2I-Adaptive can effectively bypass current safeguards, highlighting the urgent need for stronger security measures to address the misuse of I2I diffusion models.
Researcher Affiliation | Academia | College of Information Sciences and Technology, Pennsylvania State University, State College PA, USA. Correspondence to: Yaopei Zeng <EMAIL>, Lu Lin <EMAIL>.
Pseudocode | Yes | Algorithm 1 Adversarial Image Attack on Image-to-Image Diffusion Models: AdvI2I
Open Source Code | Yes | The code is available at https://github.com/Spinozaaa/AdvI2I.
Open Datasets | Yes | The images are sourced from the sexy category of the NSFW Data Scraper (Kim, 2020), consisting predominantly of human bodies.
Dataset Splits | Yes | Then, we randomly select 200 images and 10 text prompts from each set to construct 2000 image-text samples, in which 1800 samples are used for training adversarial image generators and the remaining 200 samples are for evaluation.
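The pairing and split described in this row can be sketched as follows. This is a minimal illustration only: the filenames and prompt strings are placeholders, and the random seed and shuffling procedure are assumptions, not details taken from the paper.

```python
import itertools
import random

# Placeholder stand-ins for the 200 selected images and 10 text prompts.
images = [f"img_{i:03d}.png" for i in range(200)]
prompts = [f"prompt {j}" for j in range(10)]

# Cartesian product: 200 images x 10 prompts = 2000 image-text samples.
pairs = list(itertools.product(images, prompts))

# Assumed split procedure: shuffle, then take 1800 for training
# adversarial image generators and 200 for evaluation.
random.seed(0)  # arbitrary seed for reproducibility of this sketch
random.shuffle(pairs)
train, evaluation = pairs[:1800], pairs[1800:]

print(len(pairs), len(train), len(evaluation))  # 2000 1800 200
```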
Hardware Specification | No | The paper does not explicitly state the specific hardware (e.g., GPU models, CPU types, memory amounts) used for conducting the experiments.
Software Dependencies | No | The paper mentions various models and tools used (e.g., CLIP, VAE, ChatGPT-4o, NudeNet, Q16 classifier, Stable Diffusion versions) but does not provide specific version numbers for underlying software libraries or programming languages (e.g., PyTorch, Python, CUDA versions) used for their implementation.
Experiment Setup | Yes | Specifically, we set the guidance scale to 1000, the warmup step to 7, the threshold to 0.01, the momentum scale to 0.3, and β to 0.4. (...) Here we follow (Hönig et al., 2024) to set the variance of Gaussian noise as 0.05. (...) where ϵG denotes the random Gaussian noise, and µ is the hyper-parameter to control the scale of Lsc. (...) The constraint in Eq. (2) is to ensure that the generated image gψ(x) also stays close to the original image x. To solve this constrained optimization problem, we apply a clipping function to the generated adversarial image, ensuring that the difference between gψ(x) and the input image x remains within the predefined noise bound ϵ after each update step. In practice, we set t = 1 in Eq. (2) since the latent feature at the final timestep directly influences the content of the generated image.
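The clipping step quoted above (keeping gψ(x) within the noise bound ϵ of the input x after each update) can be sketched as a projection onto the ϵ-ball around x, followed by a clamp to the valid pixel range. This is a hypothetical helper under the assumption of an L-infinity bound and pixels in [0, 1]; the paper's quoted text does not specify the norm, and `clip_to_noise_bound` is an illustrative name, not the authors' code.

```python
import numpy as np

def clip_to_noise_bound(adv: np.ndarray, orig: np.ndarray, eps: float) -> np.ndarray:
    """Project an adversarial image back so that |adv - orig| <= eps
    elementwise (assumed L-infinity bound), then clamp to valid pixels."""
    adv = np.clip(adv, orig - eps, orig + eps)  # enforce the noise bound
    return np.clip(adv, 0.0, 1.0)               # keep pixels in [0, 1]

# Toy usage: a 2x2 "image" of 0.5s with a perturbed version, eps = 0.1.
orig = np.full((2, 2), 0.5)
adv = np.array([[0.9, 0.45],
                [0.1, 0.52]])
out = clip_to_noise_bound(adv, orig, eps=0.1)
print(out)  # every entry lies within 0.1 of 0.5
```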