Manifold Constraint Reduces Exposure Bias in Accelerated Diffusion Sampling

Authors: Yuzhe YAO, Jun Chen, Zeyi Huang, Haonan Lin, Mengmeng Wang, Guang Dai, Jingdong Wang

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments demonstrate the effectiveness of the approach, achieving an FID score of 15.60 with 10-step SDXL on MS-COCO, surpassing the baseline by 2.57 FID. Section 5, titled 'EXPERIMENTS', details these empirical evaluations.
Researcher Affiliation | Collaboration | The authors are affiliated with Xi'an Jiaotong University, Zhejiang Normal University, Huawei, Zhejiang University of Technology, SGIT AI Lab (State Grid Corporation of China), and Baidu. This mix of academic institutions (universities) and industry entities (Huawei, SGIT AI Lab, Baidu) indicates a collaborative affiliation.
Pseudocode | Yes | The paper includes 'Algorithm 1 Pre-Computation for vt' and 'Algorithm 2 MCDO Sampling', which are clearly labeled algorithm blocks.
Open Source Code | No | The paper does not contain any explicit statement about code availability, or a link to a code repository, in the main text or supplementary material references.
Open Datasets | Yes | Experiments are conducted on high-resolution datasets across various tasks: MS-COCO (Lin et al., 2014) with SDXL (Podell et al., 2024), and CelebA-HQ (Karras et al., 2018), ImageNet-256 (Russakovsky et al., 2015), and LSUN-Bedroom (256×256) with LDM-4 (Rombach et al., 2022).
Dataset Splits | Yes | Results on text-to-image generation are reported on MS-COCO val2014 with SDXL and DPM-Solver++ at 1024×1024 resolution (Table 10), indicating use of the 'val2014' split of MS-COCO for evaluation.
Hardware Specification | No | The paper does not describe the hardware used for its experiments (e.g., GPU models, CPU models, or cloud resources with specifications).
Software Dependencies | No | The paper names models such as SDXL (Podell et al., 2024), LDM-4 (Rombach et al., 2022), and DPM-Solver++ (Lu et al., 2022b), but provides no version numbers for these or for underlying software components such as the programming language, libraries (e.g., PyTorch, TensorFlow), or CUDA.
Experiment Setup | Yes | For MCDO pre-computation, N = 20 and T = 1000. For TS-DDIM (Li et al., 2024a), two settings are reported for the cutoff timestep and window size: tc = 300, w = 4, and tc = 100, w = 60 for 20- and 10-step sampling with tc = 200, w = 30 for 5 steps. A uniform epsilon scale λ = 1.008 is applied for DDIM-ES (Ning et al., 2024). Results under the recommended setting (η = 0.0) (Rombach et al., 2022) are shown in Table 2, with λ the epsilon-scaling factor from Ning et al. (2024). For DDIM-MCDO, variance statistics from 64 samples are collected during 500 steps of DDIM sampling. In all experiments on CelebA-HQ and LSUN-Bedroom, the threshold timestep t_thre is set to 0. The guidance scale is s = 3.0 (Table 4).