WMAdapter: Adding WaterMark Control to Latent Diffusion Models

Authors: Hai Ci, Yiren Song, Pei Yang, Jinheng Xie, Mike Zheng Shou

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirical results show that WMAdapter provides strong flexibility, superior image quality, and competitive watermark robustness. Code: https://github.com/showlab/WMAdapter
Researcher Affiliation | Academia | 1Show Lab, National University of Singapore, Singapore. Correspondence to: Mike Zheng Shou <EMAIL>.
Pseudocode | No | The paper describes the methodology using text and architectural diagrams (Figure 2, Figure 3) but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | Code: https://github.com/showlab/WMAdapter
Open Datasets | Yes | ALL training and finetuning steps are performed on MS-COCO 2017 (Lin et al., 2014) training set.
Dataset Splits | Yes | ALL training and finetuning steps are performed on MS-COCO 2017 (Lin et al., 2014) training set. Validation is performed on COCO 2017 validation set.
Hardware Specification | Yes | For the first stage training, we adopt 8 NVIDIA A5000 GPUs of 24 GB memory, with per-GPU batchsize of 2, AdamW optimizer (Loshchilov & Hutter, 2017), a learning rate of 5e-4. We train the model for 2 epochs, taking about 5 hours. For the second stage finetuning, we use a single A5000 GPU.
Software Dependencies | No | The paper mentions the AdamW optimizer but does not name specific software libraries or frameworks, with version numbers, that would be required to replicate the experiment.
Experiment Setup | Yes | For the first stage training, we adopt 8 NVIDIA A5000 GPUs of 24 GB memory, with per-GPU batchsize of 2, AdamW optimizer (Loshchilov & Hutter, 2017), a learning rate of 5e-4. We train the model for 2 epochs, taking about 5 hours. For the second stage finetuning, we use a single A5000 GPU. We set the mini-batch to 2. We also use the AdamW optimizer and a start learning rate of 5e-4. However, we adopt a per-step cosine learning rate decay with 20 warm-up steps. Unless otherwise specified, the total fine-tuning process defaults to 2,000 steps, lasting for about 50 minutes.
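The second-stage schedule quoted above (start learning rate of 5e-4, per-step cosine decay, 20 warm-up steps, 2,000 total steps) can be sketched as a small helper. This is only an illustration of the quoted hyperparameters: the linear warm-up shape and the decay-to-zero endpoint are assumptions, since the paper's quote does not specify them.

```python
import math

def finetune_lr(step, base_lr=5e-4, warmup_steps=20, total_steps=2000):
    """Per-step learning rate matching the quoted second-stage setup:
    cosine decay with 20 warm-up steps over 2,000 total steps.
    Linear warm-up and a final LR of 0 are assumed, not stated in the paper."""
    if step < warmup_steps:
        # Linear ramp from 0 up to the start learning rate (assumed shape).
        return base_lr * (step + 1) / warmup_steps
    # Cosine decay from base_lr toward 0 over the remaining steps.
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))
```

Under these assumptions the rate peaks at 5e-4 right after warm-up (step 20) and decays smoothly toward zero by step 2,000.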