FreqTS: Frequency-Aware Token Selection for Accelerating Diffusion Models
Authors: Xinye Yang, Yuxin Yang, Haoran Pang, Aaron Xuxiang Tian, Luking Li
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments conducted on Stable Diffusion series models, PixArt-Alpha, LCM, and other models demonstrate that FreqTS achieves a minimum acceleration of 2.3× without the need for retraining. Furthermore, FreqTS showcases its versatility by being applicable to different sampling techniques and compatible with other dimension-specific acceleration algorithms. |
| Researcher Affiliation | Academia | Xinye Yang (Newcastle University), Yuxin Yang (University of Hong Kong), Haoran Pang (National University of Singapore), Aaron Xuxiang Tian (Independent Researcher), Luking Li (Independent Researcher). EMAIL, EMAIL, EMAIL |
| Pseudocode | No | The paper describes the methodology using textual explanations and mathematical equations, but it does not present any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain an explicit statement or a link indicating that the source code for the methodology is openly available or will be provided in supplementary materials. |
| Open Datasets | Yes | For quantitative evaluation, we employed the Fréchet Inception Distance (FID) (Heusel et al. 2017) as a measure of image quality. Specifically, we calculated the FID for Stable Diffusion (SD) (Rombach et al. 2022b) and SDXL (Podell et al. 2023) using 5k samples from the MS-COCO dataset (Lin et al. 2014). |
| Dataset Splits | No | The paper mentions using "5k samples from the MS-COCO dataset" but does not specify how these samples were split into training, validation, or test sets, nor does it provide percentages or counts for different splits. |
| Hardware Specification | Yes | All experiments are conducted on a GPU server equipped with an NVIDIA GeForce RTX 4090. Latency measurements are performed using PyTorch with a batch size of 1 on the GeForce RTX 4090. |
| Software Dependencies | No | The paper mentions "PyTorch" but does not specify a version number or other key software dependencies with their versions. |
| Experiment Setup | Yes | Latency measurements are performed using PyTorch with a batch size of 1 on the GeForce RTX 4090. By adjusting the token sort threshold, our method offers a trade-off between computational efficiency and generative quality. At a 40% token sort threshold, our method achieves a latency of 4.183, outperforming the widely-used DDIM method while maintaining a competitive FID score of 6.40. As we increase the token sort threshold to 70%, the latency is further reduced to 2.492, albeit with a slightly higher FID of 7.83. |
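Since the paper provides no pseudocode or open-source code, the frequency-aware token selection it describes can only be approximated. The sketch below is a hypothetical reconstruction under stated assumptions: the function name, the use of an rFFT over each token's feature dimension, and the choice of the upper half of the spectrum as the "high-frequency" band are all our guesses, not the authors' implementation. It illustrates the general idea of a "token sort threshold": rank tokens by spectral energy and drop a fraction of them.

```python
import numpy as np


def select_tokens_by_frequency(tokens: np.ndarray, drop_ratio: float) -> np.ndarray:
    """Hypothetical sketch of frequency-aware token selection.

    tokens: (N, D) array of token features.
    drop_ratio: fraction of tokens to drop (the "token sort threshold").
    Returns the sorted indices of the tokens to keep.
    """
    # Per-token magnitude spectrum along the feature dimension
    # (an assumption; the paper does not specify the transform axis).
    spectrum = np.abs(np.fft.rfft(tokens, axis=-1))
    # Treat the upper half of the spectrum as "high frequency" (assumed cutoff).
    hf_energy = spectrum[:, spectrum.shape[1] // 2 :].sum(axis=-1)
    # Keep the (1 - drop_ratio) fraction of tokens with the most
    # high-frequency content; always keep at least one token.
    n_keep = max(1, int(round(tokens.shape[0] * (1.0 - drop_ratio))))
    keep = np.argsort(hf_energy)[::-1][:n_keep]
    return np.sort(keep)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.standard_normal((10, 16))
    # A 40% threshold keeps 6 of 10 tokens; a 70% threshold keeps 3.
    print(select_tokens_by_frequency(feats, 0.4))
    print(select_tokens_by_frequency(feats, 0.7))
```

In an actual diffusion pipeline the kept indices would gate which tokens pass through the expensive attention/MLP blocks at a given denoising step, which is how a higher threshold trades FID for latency in the table above.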