Training-Free and Hardware-Friendly Acceleration for Diffusion Models via Similarity-based Token Pruning

Authors: Evelyn Zhang, Jiayi Tang, Xuefei Ning, Linfeng Zhang

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The excellent performance of diffusion models in image generation is always accompanied by excessive computation costs, which have prevented the application of diffusion models to edge devices and interactive applications. Previous works mainly focus on using fewer sampling steps and compressing the denoising network of diffusion models, while this paper proposes to accelerate diffusion models by introducing SiTo, a similarity-based token pruning method that adaptively prunes the redundant tokens in the input data. SiTo is designed to maximize the similarity between model predictions with and without token pruning using cheap and hardware-friendly operations, leading to significant acceleration ratios without a performance drop, and sometimes even improvements in generation quality. For instance, zero-shot evaluation shows SiTo achieves 1.90x and 1.75x acceleration on COCO30K and ImageNet with 1.33 and 1.15 FID reduction at the same time. Besides, SiTo has no training requirements and does not require any calibration data, making it plug-and-play in real-world applications.
Researcher Affiliation | Academia | Evelyn Zhang¹, Jiayi Tang², Xuefei Ning³, Linfeng Zhang¹*. ¹School of Artificial Intelligence, Shanghai Jiao Tong University; ²School of Computer Science and Technology, China University of Mining and Technology; ³Department of Electronic Engineering, Tsinghua University
Pseudocode | No | The paper describes the pipeline of SiTo in Figure 3 with three stages: Base Token Selection, Pruned Token Selection, and Pruned Token Recovery. These stages are described textually and with diagrams, but not in a formal pseudocode or algorithm block.
Open Source Code | Yes | Code: https://github.com/EvelynZhang-epiclab/SiTo
Open Datasets | Yes | We generate 2,000 images of ImageNet-1k (Deng et al. 2009) (2 per class) and 30,000 images of COCO30k captions (1 per caption) for evaluation.
Dataset Splits | Yes | We generate 2,000 images of ImageNet-1k (Deng et al. 2009) (2 per class) and 30,000 images of COCO30k captions (1 per caption) for evaluation.
Hardware Specification | Yes | The average latency for generating an image and the speedup are measured on a single 4090 GPU.
Software Dependencies | No | The paper mentions 'CUDA' but does not provide specific version numbers for any software dependencies.
Experiment Setup | Yes | Our experiments are conducted with SD v1.5 and SD v2 by generating 512×512 images using 50 PLMS (Liu et al. 2022) steps with a cfg scale (Dhariwal and Nichol 2021) of 7.5 and 9.0, respectively.
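The three-stage pipeline named in the Pseudocode row (Base Token Selection, Pruned Token Selection, Pruned Token Recovery) can be illustrated with a minimal NumPy sketch. The specific heuristics below — cosine similarity, picking base tokens by highest total similarity, pruning the tokens most redundant with their nearest base token — are assumptions for illustration only; the paper's exact selection criteria are given in its Figure 3 and are not reproduced here.

```python
import numpy as np

def sito_prune_sketch(tokens, keep_ratio=0.5, num_base=4):
    """Illustrative sketch of similarity-based token pruning.

    Stages (mirroring the pipeline described in the paper):
      1. Base Token Selection
      2. Pruned Token Selection
      3. Pruned Token Recovery
    The heuristics used here are illustrative assumptions, not the
    paper's exact method.
    """
    n, _ = tokens.shape
    normed = tokens / np.linalg.norm(tokens, axis=1, keepdims=True)

    # 1) Base Token Selection: keep the tokens with the highest total
    #    cosine similarity to all others (most "representative").
    sim = normed @ normed.T
    base_idx = np.argsort(sim.sum(axis=1))[-num_base:]

    # 2) Pruned Token Selection: for each non-base token, find its
    #    nearest base token; prune the most redundant tokens first.
    rest = np.setdiff1d(np.arange(n), base_idx)
    sim_to_base = normed[rest] @ normed[base_idx].T   # (|rest|, num_base)
    nearest_base = sim_to_base.argmax(axis=1)
    redundancy = sim_to_base.max(axis=1)
    num_prune = n - int(np.ceil(keep_ratio * n))
    prune_order = np.argsort(redundancy)[::-1]        # most redundant first
    pruned = rest[prune_order[:num_prune]]
    kept = np.setdiff1d(np.arange(n), pruned)

    # ... the denoising network would run on tokens[kept] only ...

    # 3) Pruned Token Recovery: copy each pruned token's nearest base
    #    token back into place so the output keeps its full length.
    recovered = tokens.copy()
    recovered[pruned] = tokens[base_idx[nearest_base[prune_order[:num_prune]]]]
    return kept, pruned, recovered
```

In an actual diffusion model the denoising network would be applied only to `tokens[kept]`, and recovery would restore the full-length token sequence before the next layer or sampling step; because every operation above is an index-select or a matrix product, the approach stays cheap and hardware-friendly, matching the paper's stated design goal.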