OmniMark: Efficient and Scalable Latent Diffusion Model Fingerprinting

Authors: Jianwei Fei, Yunshu Dai, Zhihua Xia, Fangjun Huang, Jiantao Zhou

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments demonstrate that OmniMark can be applied to various image generation and editing tasks and achieves highly accurate fingerprint detection without compromising image quality. Furthermore, OmniMark demonstrates good robustness against both white-box model attacks and image attacks, including fine-tuning and JPEG compression. We conduct extensive experiments on popular Text-to-Image (T2I) and Image-to-Image (I2I) tasks using the open-source Stable Diffusion (SD). Our evaluations demonstrate that OmniMark successfully embeds 48-bit fingerprints without compromising image quality.
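The fingerprint-detection accuracy claimed above is conventionally measured as bit accuracy between the extracted and embedded 48-bit fingerprints. A minimal sketch of that metric (the function name, thresholding at zero, and the idealized extractor outputs are assumptions for illustration, not details from the paper):

```python
import numpy as np

def bit_accuracy(decoded_logits, true_bits):
    """Fraction of fingerprint bits recovered correctly.

    decoded_logits: real-valued extractor outputs, one per bit
    true_bits: the embedded fingerprint as 0/1 values
    """
    decoded_bits = (np.asarray(decoded_logits) > 0).astype(int)
    return float(np.mean(decoded_bits == np.asarray(true_bits)))

# 48-bit example with an idealized extractor output
rng = np.random.default_rng(0)
m = rng.integers(0, 2, size=48)        # embedded fingerprint
logits = np.where(m == 1, 2.5, -2.5)   # hypothetical extractor response
print(bit_accuracy(logits, m))         # 1.0

logits[0] = -logits[0]                 # corrupt one bit
print(bit_accuracy(logits, m))         # 47/48 ~= 0.979
```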
Researcher Affiliation | Academia | Jianwei Fei¹, Yunshu Dai², Zhihua Xia³, Fangjun Huang², Jiantao Zhou¹. ¹State Key Laboratory of Internet of Things for Smart City, Department of Computer and Information Science, University of Macau; ²School of Cyber Science and Technology, Sun Yat-sen University; ³College of Cyberspace Security, Jinan University. EMAIL, EMAIL, EMAIL, EMAIL
Pseudocode | Yes | Algorithm 1: Fingerprinting LDM with OmniMark and Sharpness-Aware Embedding. Input: E, D, β, m_0 = 0. Output: fine-tuned D.
1: for i = 1 to total steps do
2:   Sample fingerprints m and real images x
3:   Initialize OmniMark layers by Eq. 6
4:   Encode the latent: z = E(x)
5:   Reconstruct image: x' = D(z; m)
6:   Compute image loss l_i = L_i(x', x)
7:   Compute fingerprint loss l_m = L_m(W(x'), m)
8:   Compute current gradient g_m^i = ∇_θ l_m |_{θ^i}
9:   Gradient ascent: θ_wc^i = θ^i + η_1 g_m^i
10:  Worst-case gradient: g_t = ∇_θ L_m(W(x'); m) |_{θ_wc^i}
11:  Update momentum: m_t ← β m_{t-1} + (1 - β) g_t
12:  Update: θ^{i+1} ← θ^i - η_2 (∇_θ l_i + g_m^i + m_t)
13: end for
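The sharpness-aware update (lines 8-12) can be illustrated on a toy problem with analytic gradients. This sketch replaces the image and fingerprint losses with simple quadratics so the gradients are exact; it demonstrates only the update rule, not the paper's VAE/extractor implementation:

```python
import numpy as np

# Toy stand-ins for the two losses: quadratics with known gradients.
# l_i (image loss) pulls theta toward a_img; l_m (fingerprint loss) toward a_fp.
a_img, a_fp = np.array([1.0, 0.0]), np.array([0.0, 1.0])
grad_li = lambda th: th - a_img   # gradient of l_i
grad_lm = lambda th: th - a_fp    # gradient of l_m

def omnimark_step(theta, m_prev, eta1=0.05, eta2=0.1, beta=0.9):
    g_m = grad_lm(theta)                     # line 8: current fingerprint gradient
    theta_wc = theta + eta1 * g_m            # line 9: ascend to worst-case weights
    g_t = grad_lm(theta_wc)                  # line 10: gradient at worst case
    m_t = beta * m_prev + (1 - beta) * g_t   # line 11: momentum update
    theta = theta - eta2 * (grad_li(theta) + g_m + m_t)  # line 12: descent
    return theta, m_t

theta, m_t = np.zeros(2), np.zeros(2)
for _ in range(500):
    theta, m_t = omnimark_step(theta, m_t)
print(theta)  # settles at a compromise between the two loss minima
```

On this toy objective the iterate converges to a fixed point between the two minima, reflecting how the real algorithm balances image fidelity against fingerprint recoverability.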
Open Source Code | Yes | Code: https://github.com/jumpycat/OmniMark
Open Datasets | Yes | Datasets: We fine-tuned the LDM VAE decoder D on the MS-COCO-2017 train set (Lin et al. 2014). The evaluations were based on the MS-COCO-2017 val set, ImageNet (Russakovsky et al. 2015), and MagicBrush (Zhang et al. 2023).
Dataset Splits | Yes | Datasets: We fine-tuned the LDM VAE decoder D on the MS-COCO-2017 train set (Lin et al. 2014). The evaluations were based on the MS-COCO-2017 val set, ImageNet (Russakovsky et al. 2015), and MagicBrush (Zhang et al. 2023). All images were cropped to 512×512 pixels.
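"Cropped to 512×512 pixels" most commonly means a center crop; the paper does not specify the exact variant, so the following NumPy sketch is one plausible reading:

```python
import numpy as np

def center_crop(img, size=512):
    """Center-crop an HxWxC image array to size x size.

    Assumes the image is at least `size` pixels on each side;
    real pipelines typically resize the short side to `size` first.
    """
    h, w = img.shape[:2]
    top = (h - size) // 2
    left = (w - size) // 2
    return img[top:top + size, left:left + size]

img = np.zeros((640, 853, 3), dtype=np.uint8)  # a typical MS-COCO image shape
print(center_crop(img).shape)  # (512, 512, 3)
```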
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory amounts) used for running its experiments.
Software Dependencies | No | The paper mentions using "SDv2.0" and "EfficientNet-B0" as models, but does not specify software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow, CUDA versions).
Experiment Setup | Yes | The modification is implemented by expanding multiple standard convolutional layers into OmniMark layers. This enables us to achieve our objectives through fine-tuning. We used 9 OmniMark layers in practice. The final objective of fine-tuning the VAE decoder with OmniMark is as follows, where λ3 is the weight balancing the two losses. In practice, λ1 and λ2 are 10.0 and λ3 is 1.0. The momentum parameter β is set as 0.9. We used 48-bit fingerprints for all methods. All images were cropped to 512×512 pixels.
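The quoted setup gives the loss weights but not which term each λ multiplies; a minimal sketch of the assumed weighted objective (the term names — pixel, perceptual, fingerprint — are hypothetical assignments inferred from the algorithm's image loss l_i and fingerprint loss l_m, and may not match the paper exactly):

```python
def total_loss(l_pixel, l_percep, l_fp, lam1=10.0, lam2=10.0, lam3=1.0):
    """Assumed fine-tuning objective: two weighted image-quality terms
    plus the fingerprint term, with the paper's reported weights as
    defaults (lam1 = lam2 = 10.0, lam3 = 1.0)."""
    return lam1 * l_pixel + lam2 * l_percep + lam3 * l_fp

# Example: small image losses, larger fingerprint loss
print(total_loss(0.02, 0.05, 0.3))  # 10*0.02 + 10*0.05 + 1*0.3 = 1.0
```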