An Undetectable Watermark for Generative Image Models
Authors: Samuel Gunn, Xuandong Zhao, Dawn Song
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We experimentally demonstrate that our watermarks are quality-preserving and robust using Stable Diffusion 2.1. Our experiments verify that, in contrast to every prior scheme we tested, our watermark does not degrade image quality. Our experiments also demonstrate robustness: existing watermark removal attacks fail to remove our watermark from images without significantly degrading the quality of the images. |
| Researcher Affiliation | Academia | Sam Gunn (UC Berkeley), Xuandong Zhao (UC Berkeley), Dawn Song (UC Berkeley) |
| Pseudocode | Yes | Our PRC consists of four algorithms, given in Appendix B: $\text{PRC.KeyGen}(n, F, t)$ samples a PRC key $k$, which will also serve as the watermarking key. The parameter $n$ is the block length, which in our case is the dimension of the latent space; $F$ is the desired false positive rate; and $t$ is a parameter which may be increased for improved undetectability at the cost of robustness. $\text{PRC.Encode}_k$ samples a PRC codeword. $\text{PRC.Detect}_k(c)$ tests whether the given string $c$ came from the PRC. $\text{PRC.Decode}_k(c)$ decodes the message from the given string $c$, if it exists. The decoder is slower and less robust than the detector. ... Algorithm $\text{Generate}(\pi, z^{(T)})$: (1) For $i = T$ down to $1$: (2) $z^{(i-1)} \leftarrow f_\epsilon(\pi, z^{(i)}, i)$ (3) $x \leftarrow D(z^{(0)})$ (4) Output $x$ ... Algorithm $\text{Sample}(\pi)$: (1) Sample $z^{(T)} \sim \mathcal{N}(0, I_n)$ (2) Compute $x \leftarrow \text{Generate}(\pi, z^{(T)})$ (3) Output $x$ ... Algorithm $\text{Recover}(x)$: (1) Compute an initial estimate $z^{(0)} \leftarrow E(x)$ of the de-noised latent. (2) For $i = 0$ to $T - 1$: (3) $z^{(i+1)} \leftarrow g_\delta(z^{(i)}, i)$ (4) Output $z^{(T)}$ |
| Open Source Code | Yes | Our code is available at https://github.com/XuandongZhao/PRC-Watermark. |
| Open Datasets | Yes | We evaluate watermarking methods on two datasets: MS-COCO (Lin et al., 2014) and the Stable Diffusion Prompt (SDP) dataset. ... We test the PRC watermark on a VAE with a 256-dimensional latent space, trained on the CelebA dataset (Liu et al., 2018). |
| Dataset Splits | Yes | We generate 500 un-watermarked images using MS-COCO captions or SDP prompts, and apply post-processing watermark methods to generate watermarked images. ... To evaluate detectability, we use ResNet18 (He et al., 2016) as the backbone model and train it on 7,500 un-watermarked images and 7,500 watermarked images (or 7,500 images watermarked with key 1 and 7,500 with key 2) to perform binary classification. |
| Hardware Specification | Yes | All experiments are conducted on NVIDIA H100 GPUs. |
| Software Dependencies | Yes | Specifically, we evaluate the performance of various watermarking schemes using the Stable Diffusion-v2.1 model... For our implementation of Generate, we employ Stable Diffusion with DPM-solvers (Lu et al., 2022) for sampling... To evaluate detectability, we use ResNet18 (He et al., 2016) as the backbone model... We use the Galois package of Hostetter (2020) for conveniently handling linear algebra over $\mathbb{F}_2$. We use the belief propagation implementation of Roffe (2022) to decode messages in the watermark. |
| Experiment Setup | Yes | All images are generated at a resolution of 512 × 512 with a latent space of 4 × 64 × 64. During inference, we apply a classifier-free guidance scale of 3.0 and sample over 50 steps using DPMSolver (Lu et al., 2022). As described in Section 3, we perform diffusion inversion using the exact inversion method from Hong et al. (2023) to obtain the latent variable z(T). In particular, we use 50 inversion steps and an inverse order of 0 to expedite detection, balancing accuracy and computational efficiency. ... Specifically, we train the model for 10 epochs with a batch size of 128 and a learning rate of 1e-4. |
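
The control flow of the Generate, Sample, and Recover algorithms quoted in the Pseudocode row can be illustrated with a minimal Python sketch. Everything below is a toy stand-in and not the authors' implementation: `f_eps`, `g_delta`, `D`, and `E` are hypothetical scalar maps chosen so that inversion is exact, whereas the paper uses Stable Diffusion with DPM-Solver sampling and the inversion method of Hong et al. (2023). The latent dimension `N = 8` is likewise an assumption for illustration; only `T = 50` is taken from the Experiment Setup row.

```python
import random

T = 50  # sampling / inversion steps (50 in the reported setup)
N = 8   # toy latent dimension (assumption; the paper's latent is 4 x 64 x 64)


def f_eps(prompt, z, i):
    """Hypothetical denoising step z^(i) -> z^(i-1): an invertible scaling."""
    scale = 1.0 - 0.01 * i / T
    return [scale * v for v in z]


def g_delta(z, i):
    """Hypothetical inversion step z^(i) -> z^(i+1): undoes f_eps at step i+1."""
    scale = 1.0 - 0.01 * (i + 1) / T
    return [v / scale for v in z]


def D(z):
    """Toy decoder from latent to 'image' (assumption: a fixed linear map)."""
    return [2.0 * v for v in z]


def E(x):
    """Toy encoder from 'image' back to latent; exact inverse of D here."""
    return [0.5 * v for v in x]


def generate(prompt, z_T):
    """Generate(pi, z^(T)): denoise for i = T down to 1, then decode."""
    z = z_T
    for i in range(T, 0, -1):
        z = f_eps(prompt, z, i)
    return D(z)


def sample(prompt, rng):
    """Sample(pi): draw z^(T) ~ N(0, I_n) and run Generate."""
    z_T = [rng.gauss(0.0, 1.0) for _ in range(N)]
    return generate(prompt, z_T)


def recover(x):
    """Recover(x): encode to an estimate of z^(0), then invert for i = 0..T-1."""
    z = E(x)
    for i in range(T):
        z = g_delta(z, i)
    return z
```

In this toy setting `recover(generate(prompt, z_T))` returns `z_T` up to floating-point error because every step is exactly invertible. In the quoted algorithms, Recover only starts from an initial estimate of the de-noised latent, so the recovered latent in a real pipeline should be treated as approximate.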