Diffusion Sampling Correction via Approximately 10 Parameters

Authors: Guangyi Wang, Wei Peng, Lijiang Li, Wenyu Chen, Yuren Cai, Song-Zhi Su

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments demonstrate that PAS can significantly enhance existing fast solvers in a plug-and-play manner with negligible costs. E.g., on CIFAR10, PAS optimizes DDIM's FID from 15.69 to 4.37 (NFE=10) using only 12 parameters and sub-minute training on a single A100 GPU. Code is available at https://github.com/onefly123/PAS. The paper includes Section 4, 'Experiments', which details datasets, evaluation metrics (FID), comparisons, and ablation studies.
Researcher Affiliation | Collaboration | Guangyi Wang is affiliated with the 'School of Informatics, Xiamen University, China' (academia) and 'Xiamen Truesight Technology Co., Ltd, China' (industry). Wei Peng is affiliated with 'Stanford University, USA' (academia). The remaining authors are affiliated with universities. Since there is a mix of academic and industry affiliations, this is a collaboration.
Pseudocode | Yes | The paper contains 'Algorithm 1 PCA-based Adaptive Search (PAS)' and 'Algorithm 2 Sampling Correction' on page 4, both clearly labeled algorithm blocks.
Open Source Code | Yes | Code is available at https://github.com/onefly123/PAS.
Open Datasets | Yes | The paper uses well-known, publicly available datasets such as 'CIFAR10 32×32 (Krizhevsky et al., 2009)', 'FFHQ 64×64 (Karras et al., 2019)', 'ImageNet 64×64 (Deng et al., 2009)', and 'LSUN Bedroom 256×256 (Yu et al., 2015)', and refers to samples from 'MS-COCO (Lin et al., 2014)' for Stable Diffusion evaluation.
Dataset Splits | Yes | 'For Stable Diffusion, we sample 10k samples from the MS-COCO (Lin et al., 2014) validation set to compute the FID, while for other datasets, we uniformly sample 50k samples.'
Hardware Specification | Yes | E.g., on CIFAR10, PAS optimizes DDIM's FID from 15.69 to 4.37 (NFE=10) using only 12 parameters and sub-minute training on a single A100 GPU. For instance, using a single NVIDIA A100 GPU, training on CIFAR10 takes only 0–2 minutes, and merely 10–20 minutes on datasets with a maximum resolution of 256.
Software Dependencies | No | The paper mentions 'Heun's 2nd solver from EDM (Karras et al., 2022)' and the torch.pca_lowrank function, but it does not provide specific version numbers for software libraries such as PyTorch or Python.
Experiment Setup | Yes | Below, we outline some recommended settings for training hyperparameters: utilizing Heun's 2nd solver (Karras et al., 2022) from EDM to generate 5k ground-truth trajectories with 100 NFE, employing the L1 loss function, setting the learning rate to 10^-2, and using a tolerance τ of 10^-4. Table 4 in Appendix B further details training settings, including learning rate, loss function, number of ground-truth trajectories, and tolerance, for different datasets and methods.
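To make the PCA idea behind 'Algorithm 1 PCA-based Adaptive Search' concrete, here is a minimal numpy sketch (the paper names torch.pca_lowrank; the function and variable names below are illustrative, not the authors' code): PCA over the states of a sampling trajectory yields a handful of orthonormal basis directions, and the correction is expressed as a small coefficient vector over that basis, which is why PAS needs only about 10 parameters.

```python
import numpy as np

def trajectory_basis(states, k=4):
    """Top-k principal directions of a sampling trajectory.

    states: (T, D) array with one flattened sample state per timestep.
    Returns a (k, D) array whose rows are orthonormal basis directions.
    """
    centered = states - states.mean(axis=0, keepdims=True)
    # SVD of the centered trajectory; rows of vt are principal directions.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:k]

def corrected_direction(direction, basis, coeffs):
    """Add a low-dimensional learned correction to a solver update direction."""
    return direction + coeffs @ basis

rng = np.random.default_rng(0)
states = rng.standard_normal((10, 32))   # toy 10-step trajectory in 32-D
basis = trajectory_basis(states, k=4)    # (4, 32)
coeffs = 0.01 * rng.standard_normal(4)   # the handful of trainable scalars
new_dir = corrected_direction(rng.standard_normal(32), basis, coeffs)
```

Only the few entries of `coeffs` would be trained; the basis itself comes for free from the trajectory, which is consistent with the negligible training cost reported above.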