Q-VDiT: Towards Accurate Quantization and Distillation of Video-Generation Diffusion Transformers

Authors: Weilun Feng, Chuanguang Yang, Haotong Qin, Xiangqi Li, Yu Wang, Zhulin An, Libo Huang, Boyu Diao, Zixiang Zhao, Yongjun Xu, Michele Magno

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our W3A6 Q-VDiT achieves a scene consistency of 23.40, setting a new benchmark and outperforming current state-of-the-art quantization methods by 1.9. Code will be available at https://github.com/cantbebetter2/Q-VDiT. ... Extensive experiments on generative benchmarks show that Q-VDiT significantly outperforms current SOTA post-training quantization methods.
Researcher Affiliation | Academia | 1) Institute of Computing Technology, Chinese Academy of Sciences; 2) University of Chinese Academy of Sciences; 3) ETH Zurich.
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | No | Code will be available at https://github.com/cantbebetter2/Q-VDiT.
Open Datasets | Yes | Following previous work ViDiT-Q (Zhao et al., 2024), we apply our Q-VDiT to Open-Sora (HPC-AI, 2024) and Latte (Ma et al., 2024) for the video generation task. ... We first evaluate the quantized model on VBench (Huang et al., 2024b) ... For Latte, we adopt the class-conditioned Latte model trained on UCF-101 and use the 20-step DDIM solver with a CFG scale of 7.0. More details can be found in Appendix Sec. D. ... We employ one randomly selected video per label from the UCF-101 dataset (101 videos in total) (Soomro, 2012) as the reference ground-truth videos for FVD evaluation.
Dataset Splits | No | The paper mentions using 10 prompts from Open-Sora and 101 prompts from UCF-101 for evaluation and calibration, and specific prompt sets for VBench evaluation (93, 72, and 86 prompts), but does not provide training/validation/test splits for the underlying models used in the experiments.
Hardware Specification | No | The paper mentions 'GPU memory' and 'GPU Time' in Table 5, but does not specify the GPU or CPU models, processor types, or memory amounts used to run the experiments.
Software Dependencies | No | The paper does not list software dependencies with version numbers for the libraries or frameworks used in the experiments.
Experiment Setup | Yes | We mainly focus on the harder settings of W4A6 (4-bit weight quantization and 6-bit activation quantization), W3A8, and W3A6. ... For post-training quantization, we calibrate 5k iters for 6-8 bit, 10k iters for 4-bit, and 15k iters for 3-bit. For calibration parameters, we use a batch size of 4, a learning rate of 1e-6 for weight quantization parameters, and 1e-5 for TQE parameters. ... For the Open-Sora (HPC-AI, 2024) model, we use 100-step DDIM with a CFG scale of 4.0. For Latte, we adopt the class-conditioned Latte model trained on UCF-101 and use the 20-step DDIM solver with a CFG scale of 7.0.
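The W4A6/W3A8/W3A6 notation in the Experiment Setup row means separate uniform bit widths for weights and activations (e.g. 4-bit weights with 6-bit activations). As a rough illustration only — this is a generic symmetric min-max "fake quantizer" in NumPy, not Q-VDiT itself, which adds calibrated quantization parameters, TQE, and distillation on top — the effect of a bit-width choice can be sketched as:

```python
import numpy as np

def uniform_quantize(x, n_bits):
    """Per-tensor symmetric uniform fake-quantization to n_bits.

    Rounds x onto a signed integer grid [-2^(n-1), 2^(n-1)-1] and maps
    back to floats, so the return value carries the quantization error.
    """
    qmax = 2 ** (n_bits - 1) - 1
    scale = np.abs(x).max() / qmax          # min-max scale from the tensor
    q = np.clip(np.round(x / scale), -qmax - 1, qmax)
    return q * scale                        # dequantized values

rng = np.random.default_rng(0)
w = rng.standard_normal((64, 64)).astype(np.float32)  # "weights"
a = rng.standard_normal((16, 64)).astype(np.float32)  # "activations"

# W4A6: 4-bit weights, 6-bit activations
w_q = uniform_quantize(w, 4)
a_q = uniform_quantize(a, 6)

# The 6-bit grid is 4x finer, so activation error is much smaller.
print(np.abs(w - w_q).mean(), np.abs(a - a_q).mean())
```

Post-training quantization methods such as the one evaluated here then tune the quantization parameters (here just `scale`) on a small calibration set rather than taking the raw min-max value.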