Classifier-Free Guidance is a Predictor-Corrector
Authors: Arwen Bradley, Preetum Nakkiran
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | For demonstration purposes, we implement the PCG sampler for Stable Diffusion XL and observe that it produces samples qualitatively similar to CFG, with guidance scales determined by our theory. Further, we explore the design axes exposed by the PCG framework, namely guidance strength and Langevin iterations, to clarify their respective effects. ... Table 1 shows FID scores (Heusel et al., 2017) on ImageNet (Russakovsky et al., 2015), using EDM2 pretrained diffusion models (Karras et al., 2024b). |
| Researcher Affiliation | Industry | Arwen Bradley Apple, Cupertino CA, USA; Preetum Nakkiran Apple, Cupertino CA, USA |
| Pseudocode | Yes | Algorithm 1 PCG_DDIM, Theory. (See Algorithm 2 for practical implementation.) ... Algorithm 2 PCG_DDIM, explicit. |
| Open Source Code | No | No concrete access to source code is provided. The paper states: 'We do not intend to propose PCG as a practical sampling method (since with certain parameters it is equivalent to CFG, but far less efficient), but rather as a tool for understanding CFG.' |
| Open Datasets | Yes | Table 1 shows FID scores (Heusel et al., 2017) on ImageNet (Russakovsky et al., 2015), using EDM2 pretrained diffusion models (Karras et al., 2024b). |
| Dataset Splits | No | No specific dataset split information is provided. The paper states: 'Metrics are calculated using 50,000 samples and 200 sampling steps, generated using EDM2 checkpoints...' |
| Hardware Specification | No | No specific hardware details are mentioned for running the experiments. The paper refers to using 'EDM2 pretrained diffusion models' but does not specify the hardware on which their experiments were conducted. |
| Software Dependencies | No | No specific ancillary software details with version numbers are provided. The paper mentions using 'EDM2 pretrained diffusion models' but no other software dependencies or versions. |
| Experiment Setup | Yes | We run CFG_DDPM with 200 denoising steps, and PCG_DDIM with 100 denoising steps and K = 1 Langevin step per denoising step. Corresponding samples appear to have qualitatively similar guidance strengths, consistent with our theory. ... All samples used 1000 denoising steps for the base predictor. Overall, we observed that increasing Langevin steps tends to improve the overall image quality, while increasing guidance strength tends to improve prompt adherence. |
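
The setup above pairs a deterministic predictor with K Langevin corrector steps per denoising step. The following is a minimal sketch of that predictor-corrector structure on a toy 1-D problem, not the paper's Algorithm 2. Everything concrete in it is an assumption made for illustration: the Gaussian conditional and unconditional distributions (so both scores are available in closed form), the variance-exploding noise schedule, the Euler step on the probability-flow ODE standing in for a DDIM step, the Langevin step-size rule, and the names `gaussian_score`, `guided_score`, and `pcg_sample`.

```python
import math
import random


def gaussian_score(x, mean, var):
    """Score (gradient of the log-density) of N(mean, var) evaluated at x."""
    return -(x - mean) / var


def guided_score(x, sigma, gamma, cond, uncond):
    """Score of the gamma-powered density p(x|c)^gamma * p(x)^(1-gamma).

    cond and uncond are (mean, variance) pairs of the clean toy distributions
    (hypothetical); adding noise of scale sigma adds sigma^2 to each variance.
    """
    s_cond = gaussian_score(x, cond[0], cond[1] + sigma ** 2)
    s_uncond = gaussian_score(x, uncond[0], uncond[1] + sigma ** 2)
    return (1.0 - gamma) * s_uncond + gamma * s_cond


def pcg_sample(gamma, cond, uncond, n_steps=100, k_langevin=1,
               step_size=0.05, sigma_max=10.0, sigma_min=0.01, seed=0):
    """Toy predictor-corrector sampler: one predictor step, then K Langevin steps."""
    rng = random.Random(seed)
    # Assumed geometric noise schedule from sigma_max down to sigma_min.
    sigmas = [sigma_max * (sigma_min / sigma_max) ** (i / n_steps)
              for i in range(n_steps + 1)]
    # Start from the noised conditional distribution at the largest noise level.
    x = rng.gauss(cond[0], math.sqrt(cond[1] + sigma_max ** 2))
    for sigma, sigma_next in zip(sigmas[:-1], sigmas[1:]):
        # Predictor: Euler step of the probability-flow ODE
        # dx/dsigma = -sigma * score, using the conditional score only.
        s_cond = gaussian_score(x, cond[0], cond[1] + sigma ** 2)
        x = x + (sigma - sigma_next) * sigma * s_cond
        # Corrector: K Langevin steps targeting the gamma-powered density
        # at the new noise level. The step size scales with sigma_next^2
        # (an assumption chosen to keep the toy iteration stable).
        eps = step_size * sigma_next ** 2
        for _ in range(k_langevin):
            noise = rng.gauss(0.0, 1.0)
            x = (x + eps * guided_score(x, sigma_next, gamma, cond, uncond)
                 + math.sqrt(2.0 * eps) * noise)
    return x
```

Calling `pcg_sample(gamma=2.0, cond=(2.0, 1.0), uncond=(0.0, 4.0), n_steps=100, k_langevin=1)` mirrors the "100 denoising steps and K = 1" configuration in spirit only. Raising `k_langevin` or `gamma` separately is one way to probe the two design axes the table mentions, Langevin iterations and guidance strength.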