Reward Guided Latent Consistency Distillation
Authors: Jiachen Li, Weixi Feng, Wenhu Chen, William Yang Wang
TMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We perform thorough experiments to demonstrate the effectiveness of our RG-LCD. Sec. 5.1 conducts human evaluation to compare the performance of our methods with baselines. Sec. 5.2 scales up the experiments to a wider array of RMs, evaluated with automatic metrics. By connecting both evaluation results, we identify problems with the current RMs. Finally, Sec. 5.3 conducts ablation studies on critical design choices. |
| Researcher Affiliation | Academia | Jiachen Li (University of California, Santa Barbara); Weixi Feng (University of California, Santa Barbara); Wenhu Chen (University of Waterloo); William Yang Wang (University of California, Santa Barbara) |
| Pseudocode | Yes | Appendix B includes pseudocode for our RG-LCD training. Appendix B.3 includes pseudocode for training our RG-LCM with an LRM in Algorithm 4. Algorithm 1: Multistep Consistency Sampling; Algorithm 2: Multistep Latent Consistency Sampling; Algorithm 3: Reward Guided Latent Consistency Distillation; Algorithm 4: Reward Guided Latent Consistency Distillation with a Latent Proxy RM |
| Open Source Code | Yes | Project Page: https://rg-lcd.github.io/ |
| Open Datasets | Yes | Our training is conducted on the CC12M dataset (Changpinyo et al., 2021)... contributing to both improved Fréchet Inception Distance (FID) on MS-COCO (Lin et al., 2014) and a higher HPSv2.1 score on HPSv2's (Wu et al., 2023a) test set... |
| Dataset Splits | No | Our training is conducted on the CC12M dataset (Changpinyo et al., 2021)... We follow a similar evaluation protocol as in (Wallace et al., 2023a) to generate images by conditioning on prompts from Partiprompt (Yu et al., 2022) (1632 prompts) and HPSv2's test set (Wu et al., 2023a) (3200 prompts). |
| Hardware Specification | Yes | We distill our LCM from the Stable Diffusion-v2.1 (Rombach et al., 2022) by training for 10K iterations on 8 NVIDIA A100 GPUs without gradient accumulation |
| Software Dependencies | No | We follow the hyperparameter settings listed in the diffusers (von Platen et al., 2022) library by setting learning rate 1e-6, EMA rate µ = 0.95 and the guidance scale range [ωmin, ωmax] = [5, 15]. |
| Experiment Setup | Yes | We distill our LCM from Stable Diffusion-v2.1 (Rombach et al., 2022) by training for 10K iterations on 8 NVIDIA A100 GPUs without gradient accumulation and set the batch size to reach the maximum capacity of our GPUs. We follow the hyperparameter settings listed in the diffusers (von Platen et al., 2022) library by setting learning rate 1e-6, EMA rate µ = 0.95 and the guidance scale range [ωmin, ωmax] = [5, 15]. As mentioned in Sec. 3.3, we use DDIM (Song et al., 2020a) as our ODE solver Ψ with a skipping step k = 20. Table 2: β for different RG-LCMs when training with different RMs. |
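The experiment-setup row above pins down a few concrete quantities: a DDIM ODE solver with skipping step k = 20, a guidance-scale range [ωmin, ωmax] = [5, 15], learning rate 1e-6, and EMA rate µ = 0.95. The sketch below is not the authors' code; it only reconstructs the implied solver sub-schedule and guidance sampling. The 1000-step training schedule, the function names, and the uniform guidance sampling are assumptions for illustration.

```python
import random

# Illustrative constants from the quoted setup; the 1000-step base
# schedule is an assumption (standard for Stable Diffusion).
NUM_TRAIN_TIMESTEPS = 1000
SKIPPING_STEP_K = 20          # "DDIM ... with a skipping step k = 20"
OMEGA_MIN, OMEGA_MAX = 5.0, 15.0  # guidance scale range [5, 15]
LEARNING_RATE = 1e-6
EMA_RATE = 0.95

def ddim_subschedule(num_train_timesteps: int, k: int) -> list[int]:
    """Timesteps the skipping-step solver visits: t, t-k, t-2k, ...

    With k = 20 over 1000 steps this gives 50 solver steps.
    """
    return list(range(num_train_timesteps - 1, -1, -k))

def sample_guidance_scale(rng: random.Random) -> float:
    """Draw a guidance scale omega uniformly from [omega_min, omega_max]
    (uniform sampling is an assumption, not stated in the table)."""
    return rng.uniform(OMEGA_MIN, OMEGA_MAX)

if __name__ == "__main__":
    schedule = ddim_subschedule(NUM_TRAIN_TIMESTEPS, SKIPPING_STEP_K)
    print(len(schedule), schedule[0], schedule[-1])  # 50 999 19
```

The skipping step trades solver accuracy for distillation speed: instead of 1000 fine-grained ODE steps, the teacher trajectory is traversed in strides of k, which is what makes consistency distillation tractable at this scale.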