Differentiable Solver Search for Fast Diffusion Sampling
Authors: Shuai Wang, Zexian Li, Qipeng Zhang, Tianhui Song, Xubin Li, Tiezheng Ge, Bo Zheng, Limin Wang
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Equipped with the searched solver, rectified-flow models, e.g., SiT-XL/2 and FlowDCN-XL/2, achieve FID scores of 2.40 and 2.35, respectively, on ImageNet 256×256 with only 10 steps. Meanwhile, the DDPM model DiT-XL/2 reaches an FID score of 2.33 with only 10 steps. Notably, our searched solver significantly outperforms the traditional solvers, even for some distillation-based methods. Moreover, our searched solver demonstrates generality across various model architectures, resolutions, and model sizes. |
| Researcher Affiliation | Collaboration | 1State Key Lab of Novel Software Technology, Nanjing University, Nanjing, China. 2Taobao & Tmall Group of Alibaba, Hangzhou, China. 3Shanghai AI Lab, Shanghai, China. Correspondence to: Limin Wang <EMAIL>. |
| Pseudocode | Yes | Algorithm 1 Solver Parametrization ... Algorithm 2 Differentiable Solver Search |
| Open Source Code | No | We will open source the training code of unified training techniques. |
| Open Datasets | Yes | On ImageNet-256×256, armed with our solver, rectified-flow models of SiT-XL/2 and FlowDCN-XL/2 achieve 2.40 and 2.35 FID respectively under 10 steps, while DDPM model of DiT-XL/2 achieves 2.33 FID. We choose the open-source DiT-XL/2 (Peebles & Xie, 2023) model trained on ImageNet 256×256 as the search model to carry out the experiments. We follow the evaluation pipeline of ADM and take COCO17-Val as the reference batch. |
| Dataset Splits | No | The paper uses standard datasets like ImageNet and COCO but does not explicitly state the training, validation, and test splits used for the models or experiments. While it mentions 'We sample 50,000 images for the entire search process,' this refers to sampling during the solver search, not data splits for model training or evaluation. |
| Hardware Specification | Yes | Notably, searching with 50,000 samples using FlowDCN-B/2 requires approximately 30 minutes on 8 H20 computation cards. |
| Software Dependencies | No | Our default training setting employs the Lion optimizer (Chen et al., 2024c) with a constant learning rate of 0.01 and no weight decay. No specific version numbers for software or libraries are provided. |
| Experiment Setup | Yes | Our default training setting employs the Lion optimizer (Chen et al., 2024c) with a constant learning rate of 0.01 and no weight decay. We sample 50,000 images for the entire search process. |
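
The optimizer setting quoted in the Experiment Setup row (Lion, constant learning rate 0.01, no weight decay) can be illustrated with a minimal sketch of the published Lion update rule. This is not the authors' training code: the `lion_step` helper, the `beta1`/`beta2` values (common Lion defaults), and the toy quadratic objective are assumptions made for illustration only.

```python
# Minimal sketch of one Lion update step on a list of scalar parameters.
# lr=0.01 and weight_decay=0.0 mirror the setting quoted in the table;
# beta1 and beta2 are assumed defaults, not values reported in this summary.

def sign(x):
    """Return -1, 0, or 1 depending on the sign of x."""
    return (x > 0) - (x < 0)


def lion_step(params, grads, momentum, lr=0.01, beta1=0.9, beta2=0.99,
              weight_decay=0.0):
    """Apply one Lion step and return (new_params, new_momentum)."""
    new_params, new_momentum = [], []
    for p, g, m in zip(params, grads, momentum):
        # The update direction is the sign of an interpolation between
        # the momentum and the current gradient.
        direction = sign(beta1 * m + (1.0 - beta1) * g)
        # Decoupled weight decay; a no-op when weight_decay is 0.0.
        new_params.append(p - lr * (direction + weight_decay * p))
        # The momentum is refreshed with a second, slower coefficient.
        new_momentum.append(beta2 * m + (1.0 - beta2) * g)
    return new_params, new_momentum


# Toy usage (hypothetical objective): minimise f(x) = x^2 from x = 1.0.
params, momentum = [1.0], [0.0]
for _ in range(50):
    grads = [2.0 * p for p in params]
    params, momentum = lion_step(params, grads, momentum)
```

Because the step direction is a sign, every parameter moves by exactly `lr` per step whatever the gradient magnitude, so a constant learning rate fixes the per-step change directly.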