CFG++: Manifold-constrained Classifier Free Guidance for Diffusion Models

Authors: Hyungjin Chung, Jeongsol Kim, Geon Yeong Park, Hyelin Nam, Jong Chul Ye

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results confirm that our method significantly enhances performance in text-to-image generation, DDIM inversion, editing, and solving inverse problems, suggesting a wide-ranging impact and potential applications in various fields that utilize text guidance. Project Page: https://cfgpp-diffusion.github.io. ... In this section, we design experiments to show the limitations of CFG and how CFG++ can effectively mitigate these downsides. The main experiments were conducted with SD v1.5 or SDXL with 50 NFE DDIM sampling. ... In Tab. 1, we report quantitative metrics using 10k images generated from COCO captions (Lin et al., 2014).
Researcher Affiliation | Academia | Hyungjin Chung, Jeongsol Kim, Geon Yeong Park, Hyelin Nam, Jong Chul Ye; KAIST; equal contribution noted
Pseudocode | Yes | Algorithm 1: Reverse Diffusion with CFG; Algorithm 2: Reverse Diffusion with CFG++
Open Source Code | No | Project Page: https://cfgpp-diffusion.github.io.
Open Datasets | Yes | Using the corresponding scales for ω and λ, we directly compare the performance of the T2I task using SD v1.5 and SDXL. In Tab. 1, we report quantitative metrics using 10k images generated from COCO captions (Lin et al., 2014). ... For evaluation, we use the FFHQ (Karras et al., 2019) 512x512 dataset
Dataset Splits | Yes | For evaluation, we use the FFHQ (Karras et al., 2019) 512x512 dataset and follow (Chung et al., 2023a) by selecting the first 1,000 images for testing.
Hardware Specification | No | No specific hardware details (such as GPU/CPU models or other machine specifications) are mentioned in the paper for running experiments.
Software Dependencies | No | The paper mentions software such as "SD v1.5", "SDXL", "SDXL-Turbo", "SDXL-Lightning", "DPM++ 2M", and "k-diffusion" but does not provide specific version numbers for any of these components, which are required for reproducibility.
Experiment Setup | Yes | We fix λ = 0.2, 0.4, 0.6, 0.8, 1.0 and find the ω values that produce the images that are of closest proximity in terms of LPIPS distance given the same seed. We found that the corresponding values were ω = 2.0, 5.0, 7.5, 9.0, 12.5, respectively. ... For gradient updates in both vanilla PSLD and PSLD with CFG, we use static step sizes of η = 1.0 and γ = 0.1 as recommended in (Rout et al., 2024). For CFG scale ω, we applied the scales that we found to correspond to each CFG++ scale λ. Please refer to Tab. 5 for the hyperparameters used for PSLD with CFG++.
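
The Pseudocode and Experiment Setup rows can be illustrated with a minimal sketch. The λ-to-ω table below is taken directly from the Experiment Setup row. The step functions are an assumption: the summary only names Algorithms 1 and 2, so the code reflects one reading of the CFG++ paper, in which both methods form a guided noise estimate for the denoised prediction, but CFG++ re-noises with the unconditional noise instead of the guided one. All function and variable names (`guided_noise`, `ddim_step`, `cfg_step`, `cfgpp_step`, `abar_t`, `abar_prev`) are hypothetical, and plain Python lists stand in for latent tensors.

```python
import math

# From the Experiment Setup row: CFG++ scale lambda -> CFG scale omega,
# matched by LPIPS distance under the same seed.
LAMBDA_TO_OMEGA = {0.2: 2.0, 0.4: 5.0, 0.6: 7.5, 0.8: 9.0, 1.0: 12.5}


def guided_noise(eps_uncond, eps_cond, scale):
    """Return eps_uncond + scale * (eps_cond - eps_uncond), element-wise."""
    return [u + scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]


def ddim_step(x_t, eps_denoise, eps_renoise, abar_t, abar_prev):
    """One deterministic DDIM step (eta = 0).

    eps_denoise is used for the denoised estimate x0_hat;
    eps_renoise is used to map x0_hat back to the previous noise level.
    abar_t and abar_prev are cumulative alpha products in (0, 1].
    """
    x0_hat = [
        (x - math.sqrt(1.0 - abar_t) * e) / math.sqrt(abar_t)
        for x, e in zip(x_t, eps_denoise)
    ]
    return [
        math.sqrt(abar_prev) * x0 + math.sqrt(1.0 - abar_prev) * e
        for x0, e in zip(x0_hat, eps_renoise)
    ]


def cfg_step(x_t, eps_uncond, eps_cond, omega, abar_t, abar_prev):
    """Assumed form of a CFG step: guided noise for denoising and re-noising."""
    eps = guided_noise(eps_uncond, eps_cond, omega)
    return ddim_step(x_t, eps, eps, abar_t, abar_prev)


def cfgpp_step(x_t, eps_uncond, eps_cond, lam, abar_t, abar_prev):
    """Assumed form of a CFG++ step: guided noise for denoising,
    unconditional noise for re-noising."""
    eps = guided_noise(eps_uncond, eps_cond, lam)
    return ddim_step(x_t, eps, eps_uncond, abar_t, abar_prev)
```

Under this reading, a comparison at matched strength would call `cfgpp_step` with `lam = 0.6` and `cfg_step` with `omega = LAMBDA_TO_OMEGA[0.6]`. The two steps coincide whenever the conditional and unconditional noise estimates are equal, since the guidance term then vanishes.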