Classifier-Free Guidance is a Predictor-Corrector

Authors: Arwen Bradley, Preetum Nakkiran

TMLR 2025

Reproducibility (Variable: Result — LLM Response)

- Research Type: Experimental — "For demonstration purposes, we implement the PCG sampler for Stable Diffusion XL and observe that it produces samples qualitatively similar to CFG, with guidance scales determined by our theory. Further, we explore the design axes exposed by the PCG framework, namely guidance strength and Langevin iterations, to clarify their respective effects." ... "Table 1 shows FID scores (Heusel et al., 2017) on ImageNet (Russakovsky et al., 2015), using EDM2 pretrained diffusion models (Karras et al., 2024b)."
- Researcher Affiliation: Industry — Arwen Bradley, Apple, Cupertino, CA, USA; Preetum Nakkiran, Apple, Cupertino, CA, USA.
- Pseudocode: Yes — "Algorithm 1 PCG-DDIM, Theory. (See Algorithm 2 for practical implementation.)" ... "Algorithm 2 PCG-DDIM, explicit."
- Open Source Code: No — No concrete access to source code is provided. The paper states: "We do not intend to propose PCG as a practical sampling method (since with certain parameters it is equivalent to CFG, but far less efficient), but rather as a tool for understanding CFG."
- Open Datasets: Yes — "Table 1 shows FID scores (Heusel et al., 2017) on ImageNet (Russakovsky et al., 2015), using EDM2 pretrained diffusion models (Karras et al., 2024b)."
- Dataset Splits: No — No specific dataset split information is provided. The paper states: "Metrics are calculated using 50,000 samples and 200 sampling steps, generated using EDM2 checkpoints..."
- Hardware Specification: No — No specific hardware details are mentioned for the experiments. The paper refers to using "EDM2 pretrained diffusion models" but does not specify the hardware on which its experiments were run.
- Software Dependencies: No — No ancillary software details with version numbers are provided. The paper mentions using "EDM2 pretrained diffusion models" but lists no other software dependencies or versions.
- Experiment Setup: Yes — "We run CFG-DDPM with 200 denoising steps, and PCG-DDIM with 100 denoising steps and K = 1 Langevin step per denoising step. Corresponding samples appear to have qualitatively similar guidance strengths, consistent with our theory." ... "All samples used 1000 denoising steps for the base predictor. Overall, we observed that increasing Langevin steps tends to improve the overall image quality, while increasing guidance strength tends to improve prompt adherence."
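To make the described setup concrete, here is a rough sketch of a predictor-corrector sampler of the kind the paper analyzes: a deterministic DDIM-style predictor step followed by K Langevin corrector steps per noise level, with the corrector driven by the CFG-combined score. This is not the paper's Algorithm 2; the function names (`eps_cond`, `eps_uncond`), the score parameterization, and the Langevin step-size heuristic `tau * sigma**2` are our own assumptions for illustration.

```python
import numpy as np

def pcg_sample(eps_cond, eps_uncond, x, sigmas, w=3.0, K=1, tau=0.1, rng=None):
    """Toy PCG-style sampler sketch (not the paper's exact algorithm).

    eps_cond / eps_uncond: hypothetical conditional / unconditional
        noise-prediction callables (x, sigma) -> eps (our naming).
    sigmas: decreasing noise schedule; w: guidance strength;
    K: Langevin corrector steps per level; tau: heuristic step-size factor.
    """
    rng = rng or np.random.default_rng(0)
    for s, s_next in zip(sigmas[:-1], sigmas[1:]):
        # Predictor: one deterministic (DDIM / ODE Euler) denoising step
        # using the conditional model only.
        x = x + (s_next - s) * eps_cond(x, s)
        # Corrector: K Langevin steps at the new noise level, driven by the
        # CFG-combined noise prediction (targeting the gamma-powered
        # distribution, per the paper's predictor-corrector reading of CFG).
        for _ in range(K):
            eps_g = (1 + w) * eps_cond(x, s_next) - w * eps_uncond(x, s_next)
            score = -eps_g / s_next        # score(x) is approx. -eps / sigma
            eta = tau * s_next ** 2        # heuristic Langevin step size
            x = x + eta * score + np.sqrt(2 * eta) * rng.standard_normal(x.shape)
    return x
```

With K = 0 this collapses to a plain conditional DDIM sampler, which matches the paper's framing of guidance strength and Langevin iterations as two separate design axes.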