CFG++: Manifold-constrained Classifier Free Guidance for Diffusion Models
Authors: Hyungjin Chung, Jeongsol Kim, Geon Yeong Park, Hyelin Nam, Jong Chul Ye
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results confirm that our method significantly enhances performance in text-to-image generation, DDIM inversion, editing, and solving inverse problems, suggesting a wide-ranging impact and potential applications in various fields that utilize text guidance. Project Page: https://cfgpp-diffusion.github.io. ... In this section, we design experiments to show the limitations of CFG and how CFG++ can effectively mitigate these downsides. The main experiments were conducted with SD v1.5 or SDXL with 50 NFE DDIM sampling. ... In Tab. 1, we report quantitative metrics using 10k images generated from COCO captions (Lin et al., 2014). |
| Researcher Affiliation | Academia | Hyungjin Chung, Jeongsol Kim, Geon Yeong Park, Hyelin Nam, Jong Chul Ye; KAIST; Equal Contribution; EMAIL |
| Pseudocode | Yes | Algorithm 1: Reverse Diffusion with CFG; Algorithm 2: Reverse Diffusion with CFG++ |
| Open Source Code | No | Project Page: https://cfgpp-diffusion.github.io. |
| Open Datasets | Yes | Using the corresponding scales for ω and λ, we directly compare the performance of the T2I task using SD v1.5 and SDXL. In Tab. 1, we report quantitative metrics using 10k images generated from COCO captions (Lin et al., 2014). ... For evaluation, we use the FFHQ (Karras et al., 2019) 512x512 dataset |
| Dataset Splits | Yes | For evaluation, we use the FFHQ (Karras et al., 2019) 512x512 dataset and follow (Chung et al., 2023a) by selecting the first 1,000 images for testing. |
| Hardware Specification | No | No specific hardware details (like GPU/CPU models or specific computer specifications) are mentioned in the paper for running experiments. |
| Software Dependencies | No | The paper mentions software like "SD v1.5", "SDXL", "SDXL-Turbo", "SDXL-Lightning", "DPM++ 2M", and "k-diffusion" but does not provide specific version numbers for any of these components, which are required for reproducibility. |
| Experiment Setup | Yes | We fix λ = 0.2, 0.4, 0.6, 0.8, 1.0 and find the ω values that produce the images that are of closest proximity in terms of LPIPS distance given the same seed. We found that the corresponding values were ω = 2.0, 5.0, 7.5, 9.0, 12.5, respectively. ... For gradient updates in both vanilla PSLD and PSLD with CFG, we use static step sizes of η = 1.0 and γ = 0.1 as recommended in (Rout et al., 2024). For the CFG scale ω, we applied the scales that we found to correspond to each CFG++ scale λ. Please refer to Tab. 5 for the hyperparameters used for PSLD with CFG++. |
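
The Pseudocode and Experiment Setup rows can be illustrated with a minimal sketch. The λ-to-ω table below copies the matched scales quoted above. The two step functions are only one reading of Algorithms 1 and 2 (a deterministic DDIM step in which CFG++ re-noises with the unconditional noise estimate), not the authors' code. All function and variable names are hypothetical, and `abar_t` / `abar_prev` are assumed to be the cumulative noise-schedule products at the current and next timestep.

```python
import math

# Matched scales quoted in the Experiment Setup row: CFG++ lambda -> CFG omega.
LAMBDA_TO_OMEGA = {0.2: 2.0, 0.4: 5.0, 0.6: 7.5, 0.8: 9.0, 1.0: 12.5}


def guided_eps(eps_uncond, eps_cond, scale):
    """Blend unconditional and conditional noise predictions by a guidance scale."""
    return eps_uncond + scale * (eps_cond - eps_uncond)


def ddim_step(x_t, eps_denoise, eps_renoise, abar_t, abar_prev):
    """One deterministic DDIM update with separate noise terms for the two stages."""
    # Denoising stage: estimate the clean sample from x_t.
    x0_hat = (x_t - math.sqrt(1.0 - abar_t) * eps_denoise) / math.sqrt(abar_t)
    # Re-noising stage: move the estimate to the next noise level.
    return math.sqrt(abar_prev) * x0_hat + math.sqrt(1.0 - abar_prev) * eps_renoise


def cfg_step(x_t, eps_uncond, eps_cond, omega, abar_t, abar_prev):
    """Assumed CFG step: the guided noise is used for both stages."""
    eps = guided_eps(eps_uncond, eps_cond, omega)
    return ddim_step(x_t, eps, eps, abar_t, abar_prev)


def cfgpp_step(x_t, eps_uncond, eps_cond, lam, abar_t, abar_prev):
    """Assumed CFG++ step: guided noise for denoising, unconditional for re-noising."""
    eps = guided_eps(eps_uncond, eps_cond, lam)
    return ddim_step(x_t, eps, eps_uncond, abar_t, abar_prev)
```

The inputs may be plain floats or arrays that support elementwise arithmetic. For a like-for-like comparison in the spirit of the quoted setup, one would call `cfgpp_step` with `lam=0.6` and `cfg_step` with `omega=LAMBDA_TO_OMEGA[0.6]` on the same inputs.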