Manifold Constraint Reduces Exposure Bias in Accelerated Diffusion Sampling
Authors: Yuzhe Yao, Jun Chen, Zeyi Huang, Haonan Lin, Mengmeng Wang, Guang Dai, Jingdong Wang
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments demonstrate the effectiveness of the approach, achieving an FID score of 15.60 with 10-step SDXL on MS-COCO, surpassing the baseline by 2.57 FID. Section 5, titled 'EXPERIMENTS', details these empirical evaluations. |
| Researcher Affiliation | Collaboration | The authors are affiliated with: 1Xi'an Jiaotong University, 2Zhejiang Normal University, 3Huawei, 4Zhejiang University of Technology, 5SGIT AI Lab, State Grid Corporation of China, 6Baidu. This includes both academic institutions (universities) and industry entities (Huawei, SGIT AI Lab, Baidu), indicating a collaborative affiliation. |
| Pseudocode | Yes | The paper includes 'Algorithm 1 Pre-Computation for vt' and 'Algorithm 2 MCDO Sampling', which are clearly labeled algorithm blocks. |
| Open Source Code | No | The paper does not contain any explicit statements about code availability or links to a code repository in the main text or supplementary material references. |
| Open Datasets | Yes | We conduct experiments on high-resolution datasets across various tasks. Specifically, we evaluate MS-COCO (Lin et al., 2014) with SDXL (Podell et al., 2024), and CelebA-HQ (Karras et al., 2018), ImageNet-256 (Russakovsky et al., 2015), and LSUN-Bedroom (256×256) with LDM-4 (Rombach et al., 2022). |
| Dataset Splits | Yes | Results on Text-to-Image generation on MS-COCO val2014 with SDXL and DPM-Solver++. Image resolution is 1024×1024 (Table 10). This indicates the use of the 'val2014' split of the MS-COCO dataset for evaluation. |
| Hardware Specification | No | The paper does not explicitly describe any specific hardware used for running its experiments (e.g., GPU models, CPU models, or cloud resources with specifications). |
| Software Dependencies | No | The paper mentions models like 'SDXL (Podell et al., 2024)', 'LDM-4 (Rombach et al., 2022)', and 'DPM-Solver++ (Lu et al., 2022b)', but it does not provide specific version numbers for these or any underlying software components like programming languages, libraries (e.g., PyTorch, TensorFlow), or CUDA versions. |
| Experiment Setup | Yes | To implement MCDO, we use N = 20, T = 1000 for pre-computation. For TS-DDIM (Li et al., 2024a) we use tc = 300, w = 4 for the cutoff timestep and window size. A uniform epsilon scale λ = 1.008 is applied for DDIM-ES (Ning et al., 2024). For TS-DDIM (Li et al., 2024a), we set tc = 100, w = 60 for 20- and 10-step sampling, and tc = 200, w = 30 for 5 steps. Results under the recommended setting (η = 0.0) (Rombach et al., 2022) are shown in Table 2, with λ representing the epsilon scaling factor from Ning et al. (2024). For implementing DDIM-MCDO, variance statistics from 64 samples are collected during 500 steps of DDIM sampling. In all experiments on CelebA-HQ and LSUN-Bedroom, the threshold timestep t_thre is set to 0, and the guidance scale is s = 3.0 (Table 4). |
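For reference, the hyperparameters reported in the experiment-setup cell can be gathered into a single configuration sketch. This is purely illustrative: the key names and structure are assumptions, not taken from any released codebase (the paper reports no open-source code).

```python
# Hypothetical configuration collecting the hyperparameters quoted above.
# Key names are illustrative; only the numeric values come from the paper.
MCDO_CONFIG = {
    "precompute": {"N": 20, "T": 1000},            # pre-computation for v_t (Algorithm 1)
    "ts_ddim_mscoco": {"tc": 300, "w": 4},          # cutoff timestep / window size
    "ddim_es_lambda": 1.008,                        # uniform epsilon scale (Ning et al., 2024)
    "ts_ddim_by_steps": {
        (20, 10): {"tc": 100, "w": 60},             # 20- and 10-step sampling
        (5,):     {"tc": 200, "w": 30},             # 5-step sampling
    },
    "ddim_mcdo_stats": {"num_samples": 64, "ddim_steps": 500},  # variance statistics
    "t_thre": 0,                                    # CelebA-HQ / LSUN-Bedroom threshold timestep
    "guidance_scale": 3.0,                          # classifier-free guidance (Table 4)
}
```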