T-Stitch: Accelerating Sampling in Pre-Trained Diffusion Models with Trajectory Stitching

Authors: Zizheng Pan, Bohan Zhuang, De-An Huang, Weili Nie, Zhiding Yu, Chaowei Xiao, Jianfei Cai, Anima Anandkumar

ICLR 2025 | Venue PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments demonstrate that T-Stitch is training-free, generally applicable for different architectures, and complements most existing fast sampling techniques with flexible speed and quality trade-offs. On DiT-XL, for example, 40% of the early timesteps can be safely replaced with a 10× faster DiT-S without performance drop on class-conditional ImageNet generation. We further show that our method can also be used as a drop-in technique to not only accelerate the popular pretrained Stable Diffusion (SD) models but also improve the prompt alignment of stylized SD models from the public model zoo.
Researcher Affiliation | Collaboration | 1Monash University, 2NVIDIA, 3University of Wisconsin-Madison, 4Caltech
Pseudocode | No | The paper describes methods conceptually and with mathematical equations, but does not include any explicit 'Pseudocode' or 'Algorithm' blocks.
Open Source Code | No | The paper mentions using a 'public model zoo on Diffusers (von Platen et al., 2022)' and 'HuggingFace.co and Civitai.com', which are third-party resources or communities. It does not provide any explicit statement from the authors about releasing their own code for the methodology described in this paper.
Open Datasets | Yes | Following DiT, we conduct the class-conditional ImageNet experiments based on pretrained DiT-S/B/XL under 256×256 images and patch size of 2.
Dataset Splits | Yes | We use the reference batch from ADM (Dhariwal & Nichol, 2021) and sample 5,000 images to compute FID.
Hardware Specification | Yes | For example, even with a high-performance RTX 3090, generating 8 images with DiT-XL (Peebles & Xie, 2022) takes 16.5 seconds with 100 denoising steps, which is 10× slower than its smaller counterpart DiT-S (1.7s) with a lower generation quality.
Software Dependencies | No | The paper mentions 'Diffusers' implicitly through a citation, but no specific version numbers are provided for any software libraries or dependencies used for the implementation of T-Stitch.
Experiment Setup | Yes | By default, we adopt a classifier-free guidance scale of 1.5 as it achieves the best FID for DiT-XL, which is also the target model in our setting.
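The core mechanism the report describes (replacing an early fraction of denoising timesteps with a smaller model, e.g. 40% of DiT-XL steps handled by DiT-S) can be sketched in a few lines. This is a minimal illustration, not the authors' released implementation: the `small_model`/`large_model` callables and the single-step update they perform (e.g. a DDIM update) are assumptions for the sketch.

```python
def stitched_sampling(x, timesteps, small_model, large_model, switch_frac=0.4):
    """Trajectory stitching sketch: run the earliest (noisiest) fraction of
    denoising steps with the small model, then hand off to the large model.

    `small_model` and `large_model` are hypothetical callables performing one
    denoising step `x_next = model(x, t)`; `switch_frac=0.4` mirrors the
    paper's example of replacing 40% of early DiT-XL steps with DiT-S.
    """
    n_switch = int(len(timesteps) * switch_frac)
    for i, t in enumerate(timesteps):
        model = small_model if i < n_switch else large_model
        x = model(x, t)  # one denoising update per timestep
    return x
```

Because the switch point is a single fraction, it doubles as a knob for the speed/quality trade-off the report mentions: `switch_frac=0.0` recovers pure large-model sampling, `1.0` pure small-model sampling.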