FreqTS: Frequency-Aware Token Selection for Accelerating Diffusion Models

Authors: Xinye Yang, Yuxin Yang, Haoran Pang, Aaron Xuxiang Tian, Luking Li

AAAI 2025

Reproducibility

Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments conducted on Stable Diffusion series models, PixArt-Alpha, LCM, and other models demonstrate that FreqTS achieves a minimum acceleration of 2.3× without the need for retraining. Furthermore, FreqTS showcases its versatility by being applicable to different sampling techniques and compatible with other dimension-specific acceleration algorithms.
Researcher Affiliation | Academia | Xinye Yang (Newcastle University), Yuxin Yang (University of Hong Kong), Haoran Pang (National University of Singapore), Aaron Xuxiang Tian (Independent Researcher), Luking Li (Independent Researcher). EMAIL, EMAIL, EMAIL
Pseudocode | No | The paper describes the methodology using textual explanations and mathematical equations, but it does not present any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | No | The paper does not contain an explicit statement or a link indicating that the source code for the methodology is openly available or will be provided in supplementary materials.
Open Datasets | Yes | For quantitative evaluation, we employed the Fréchet Inception Distance (FID) (Heusel et al. 2017) as a measure of image quality. Specifically, we calculated the FID for Stable Diffusion (SD) (Rombach et al. 2022b) and SDXL (Podell et al. 2023) using 5k samples from the MS-COCO dataset (Lin et al. 2014).
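For reference, the FID cited above has a standard definition (Heusel et al. 2017): real and generated images are embedded with an Inception network, each embedding set is modeled as a Gaussian with mean μ and covariance Σ, and the metric is the Fréchet distance between the two Gaussians. The paper's summary only names the metric; the formula below is the standard one, not anything paper-specific:

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left(\Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2}\right)
```

Here the subscripts r and g denote the real (MS-COCO) and generated sample statistics, respectively; lower FID indicates generated images closer to the real distribution.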
Dataset Splits | No | The paper mentions using "5k samples from the MS-COCO dataset" but does not specify how these samples were split into training, validation, or test sets, nor does it provide percentages or counts for different splits.
Hardware Specification | Yes | All experiments are conducted on a GPU server equipped with an NVIDIA GeForce RTX 4090. Latency measurements are performed using PyTorch with a batch size of 1 on the GeForce RTX 4090.
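The latency protocol described (PyTorch, batch size 1) can be sketched as a generic warmup-then-time loop. The function names and iteration counts below are illustrative stand-ins, not the paper's harness; when timing actual GPU work you would also call `torch.cuda.synchronize()` before each timestamp so queued kernels are included:

```python
import time
import statistics

def measure_latency(run_once, warmup=3, repeats=10):
    """Median wall-clock latency of one call to `run_once`, after warmup runs."""
    for _ in range(warmup):
        run_once()                   # warmup: let caching/compilation effects settle
    samples = []
    for _ in range(repeats):
        t0 = time.perf_counter()
        run_once()                   # on GPU: torch.cuda.synchronize() before/after
        samples.append(time.perf_counter() - t0)
    return statistics.median(samples)

# Stand-in for a single batch-size-1 sampling call of a diffusion model.
def dummy_sample():
    sum(i * i for i in range(10_000))

latency = measure_latency(dummy_sample)
print(f"median latency: {latency:.6f} s")
```

Medians over repeated runs are preferred to single measurements because per-call timings on a shared GPU server are noisy.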
Software Dependencies | No | The paper mentions "PyTorch" but does not specify a version number or other key software dependencies with their versions.
Experiment Setup | Yes | Latency measurements are performed using PyTorch with a batch size of 1 on the GeForce RTX 4090. By adjusting the token sort threshold, our method offers a trade-off between computational efficiency and generative quality. At a 40% token sort threshold, our method achieves a latency of 4.183, outperforming the widely-used DDIM method while maintaining a competitive FID score of 6.40. As we increase the token sort threshold to 70%, the latency is further reduced to 2.492, albeit with a slightly higher FID of 7.83.
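The "token sort threshold" trade-off can be illustrated with a minimal sketch: tokens are ranked by an importance score (in FreqTS, a frequency-based criterion whose details this summary does not reproduce) and the lowest-ranked fraction is skipped. The function name and scoring below are illustrative assumptions, not the paper's implementation:

```python
def select_tokens(scores, threshold):
    """Return indices of tokens kept after dropping the lowest-scoring
    `threshold` fraction (e.g. threshold=0.4 skips 40% of tokens)."""
    n_drop = int(len(scores) * threshold)
    # Rank token indices by score, highest first.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    kept = ranked[: len(scores) - n_drop]
    return sorted(kept)  # restore original token order

# Example: 4 tokens, skip the lowest-scoring half.
print(select_tokens([0.9, 0.1, 0.5, 0.7], threshold=0.5))  # → [0, 3]
```

A higher threshold skips more tokens per step, lowering latency at the cost of generative quality, which matches the 40% vs. 70% settings and their FID/latency numbers reported above.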