Image-level Memorization Detection via Inversion-based Inference Perturbation
Authors: Yue Jiang, Haokun Lin, Yang Bai, Bo Peng, Zhili Liu, Yueming Lyu, Yong Yang, Xing Zheng, Jing Dong
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We formalize the image-level memorization detection task for text-to-image diffusion models across various datasets with a comprehensive setup, which assists in inspecting the security of training images against memorization risks. Extensive experiments demonstrate that our method achieves state-of-the-art performance in detecting memorized images in various settings. Additionally, our IIP framework shows strong robustness against data augmentation attacks. We conduct preliminary experiments to uncover clues for identifying memorized images. Our approach primarily involves perturbed inference to explore potential features. Specifically, we remove original prompts at different timesteps and replace them with meaningless ones (e.g., "ata") during the subsequent generation period. Interestingly, we discover two key characteristics: after perturbation operations, 1) the similarity of images changes in a distinct pattern, and 2) the MTCNP remains larger for memorized images. We apply these metrics to detect memorization in 2000 images, with the corresponding ROC plots presented in Fig. 4 (bottom). Both metrics effectively detect memorization, with AUC scores exceeding 0.9. We conduct the ablation study on both the training and generated datasets, and the results are presented in Tab. 5. We also analyze our performance with initializations at different timesteps on the training dataset in Fig. 7. |
| Researcher Affiliation | Collaboration | Yue Jiang 1,2, Haokun Lin 1,2,3, Yang Bai 4, Bo Peng 1, Zhili Liu 5, Yueming Lyu 6, Yong Yang 4, Xing Zheng 4, Jing Dong 1. Affiliations: 1 NLPR, MAIS, Institute of Automation, Chinese Academy of Sciences; 2 School of Artificial Intelligence, UCAS; 3 City University of Hong Kong; 4 Tencent Security Platform Department; 5 HKUST; 6 Nanjing University |
| Pseudocode | No | The paper describes the methodology through equations and textual explanations, for example in Sections 2.1, 2.2, 3.1, and 3.2. However, it does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks with structured, code-like steps. |
| Open Source Code | Yes | Our code and datasets are available at https://github.com/joellejiang/IIP. |
| Open Datasets | Yes | Datasets. We evaluate our method across three dataset types for comprehensive analysis in various settings: 1) memorized and non-memorized training images, 2) memorized training images and non-training images, and 3) generated images from memorized and non-memorized prompts. All settings in the main paper are based on Stable Diffusion v1.4 (Rombach et al., 2022) and results on other versions of Stable Diffusion are provided in the appendix. For memorized training images, we follow Webster (2023) and collect 152 memorized images from the training set of SD v1.4. ... Since LAION-mi (Dubiński et al., 2024) extracts part of the training set and non-member sets for Stable Diffusion v1.4, we select 1200 non-member images from LAION-mi as non-memorized non-training images. For generated images, we follow the setting of Wen et al. (2024), selecting 500 memorized prompts from Webster (2023) for Stable Diffusion v1.4 and 500 non-memorized prompts from various sources, including LAION (Schuhmann et al., 2022), COCO (Lin et al., 2014), Lexica.art, and randomly generated strings. |
| Dataset Splits | Yes | For memorized training images, we follow Webster (2023) and collect 152 memorized images from the training set of SD v1.4. ... retaining 1200 images. ... we select 1200 non-member images from LAION-mi as non-memorized non-training images. For generated images, we follow the setting of Wen et al. (2024), selecting 500 memorized prompts from Webster (2023) for Stable Diffusion v1.4 and 500 non-memorized prompts from various sources, including LAION (Schuhmann et al., 2022), COCO (Lin et al., 2014), Lexica.art, and randomly generated strings. Each prompt generates 4, 8, and 16 images, resulting in a total of 2K, 4K, and 8K generated images for both types of prompts. |
| Hardware Specification | Yes | All experiments are conducted on a single A100. |
| Software Dependencies | No | The paper mentions models and methods like 'Stable Diffusion v1.4 (Rombach et al., 2022)', 'DDIM sampling (Song et al., 2020)', and 'classifier-free guidance (Ho & Salimans, 2022)'. However, it does not provide specific version numbers for underlying software libraries or programming languages (e.g., Python, PyTorch, CUDA versions). |
| Experiment Setup | Yes | Implementation Details. All baselines employ the same DDIM sampling (Song et al., 2020), with inference steps and guidance scales consistently set to 50 and 7.5, respectively. Importantly, none of the methods access the original prompts for detection. In our experiment, we set tI = 20 and tI = 10, and hyperparameters λd = 1.0, λe = 1.0. |
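The perturbed-inference mechanism the paper describes — denoise with the original prompt up to a swap timestep, then continue with a meaningless prompt and measure how much the output changes — can be illustrated with a toy stand-in for the diffusion model. Everything here (the `denoise_step` dynamics, the embeddings, the distance metric) is an illustrative assumption, not the paper's implementation; it only shows why memorized samples, which are insensitive to the prompt, produce a smaller change under the swap.

```python
import numpy as np

def denoise_step(x, prompt_emb, memorized):
    # Toy stand-in for one denoising step: a "memorized" sample drifts
    # toward a fixed attractor regardless of the prompt, while a
    # non-memorized sample follows the prompt embedding.
    target = np.ones_like(x) if memorized else prompt_emb
    return x + 0.1 * (target - x)

def generate(prompt_emb, swap_emb, t_swap, memorized, steps=50, seed=0):
    # Run the toy sampler; from timestep t_swap onward, the original
    # prompt embedding is replaced by swap_emb (the meaningless prompt).
    x = np.random.default_rng(seed).standard_normal(prompt_emb.shape)
    for t in range(steps):
        emb = prompt_emb if t < t_swap else swap_emb
        x = denoise_step(x, emb, memorized)
    return x

dim = 8
prompt = np.full(dim, 2.0)    # stand-in for a real prompt embedding
null_prompt = np.zeros(dim)   # stand-in for a meaningless prompt ("ata")

gaps = {}
for memorized in (True, False):
    unperturbed = generate(prompt, prompt, t_swap=50, memorized=memorized)
    perturbed = generate(prompt, null_prompt, t_swap=20, memorized=memorized)
    gaps[memorized] = float(np.linalg.norm(unperturbed - perturbed))

# The memorized sample is insensitive to the prompt swap; the
# non-memorized sample changes noticeably. That gap is the signal.
print(gaps)
```

In the toy dynamics the memorized trajectory is identical with or without the swap (gap 0), while the non-memorized trajectory diverges once the prompt is removed, mirroring the "distinct pattern" in image similarity the paper reports.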