MPQ-DM: Mixed Precision Quantization for Extremely Low Bit Diffusion Models

Authors: Weilun Feng, Haotong Qin, Chuanguang Yang, Zhulin An, Libo Huang, Boyu Diao, Fei Wang, Renshuai Tao, Yongjun Xu, Michele Magno

AAAI 2025

Reproducibility Variable Result LLM Response
Research Type | Experimental | Comprehensive experiments demonstrate that MPQ-DM achieves significant accuracy gains under extremely low bit-widths compared with SOTA quantization methods. MPQ-DM achieves a 58% FID decrease under the W2A4 setting compared with the baseline, while all other methods even collapse. (Supporting sections: 4 Experiment; 4.1 Experiment Settings; 4.2 Experiment Results; 4.3 Ablation Study)
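For readers unfamiliar with the WxAy notation used above: "W2A4" means weights are quantized to 2 bits and activations to 4 bits. The sketch below illustrates plain symmetric uniform quantization at those bit-widths; the function name and the per-tensor max-based scaling are illustrative simplifications, not the paper's MPQ-DM method (which uses mixed precision and outlier-aware channel allocation).

```python
import numpy as np

def uniform_quantize(x, bits):
    """Symmetric uniform fake-quantization: round x onto 2**bits signed levels.

    Per-tensor max-based scaling is a simplification for illustration only.
    """
    qmax = 2 ** (bits - 1) - 1               # 1 for 2-bit, 7 for 4-bit
    scale = np.abs(x).max() / qmax           # per-tensor scale
    q = np.clip(np.round(x / scale), -qmax - 1, qmax)
    return q * scale                         # dequantized ("fake-quantized") values

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 4))                  # toy weight tensor
a = rng.normal(size=(4,))                    # toy activation tensor

w_q = uniform_quantize(w, bits=2)            # "W2": at most 4 distinct levels
a_q = uniform_quantize(a, bits=4)            # "A4": at most 16 distinct levels

print(len(np.unique(w_q)), len(np.unique(a_q)))
```

At 2 bits the weight tensor collapses onto a handful of values, which is why the paper's W2A4 setting is described as "extremely low bit-width" and why baseline methods collapse there.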
Researcher Affiliation | Academia | (1) Institute of Computing Technology, Chinese Academy of Sciences; (2) University of Chinese Academy of Sciences; (3) ETH Zurich; (4) Beijing Jiaotong University
Pseudocode | No | The paper describes the methods textually and with figures, but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | Code: https://github.com/cantbebetter2/MPQ-DM
Open Datasets | Yes | We conduct experiments on commonly used datasets LSUN-Bedrooms 256×256, LSUN-Churches 256×256 (Yu et al. 2015), and ImageNet 256×256 (Deng et al. 2009) for both unconditional and conditional image generation tasks on LDM models. We also conduct a text-to-image generation task on Stable Diffusion (Rombach et al. 2022). We use IS (Salimans et al. 2016), FID (Heusel et al. 2017), sFID (Nash et al. 2021), and Precision to evaluate LDM performance. For Stable Diffusion, we use CLIP Score (Hessel et al. 2021) for evaluation. ... We conduct the text-to-image generation experiment on 10k randomly selected COCO2014 validation set prompts over the Stable Diffusion v1.4 model at 512×512 resolution.
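FID, the headline metric in the row above, is the Fréchet distance between Gaussians fitted to Inception features of real and generated images: FID = ||mu1 - mu2||^2 + Tr(S1 + S2 - 2(S1 S2)^(1/2)). The sketch below assumes diagonal covariances so the matrix square root reduces to an elementwise one; the full metric (and the paper's evaluation pipeline) uses full covariance matrices and Inception-v3 features.

```python
import numpy as np

def fid_diagonal(mu1, var1, mu2, var2):
    """FID between two Gaussians with diagonal covariances.

    Simplification for illustration: with diagonal covariances,
    (Sigma1 @ Sigma2)^(1/2) is just the elementwise sqrt of var1 * var2.
    """
    diff = mu1 - mu2
    covmean = np.sqrt(var1 * var2)
    return diff @ diff + np.sum(var1 + var2 - 2.0 * covmean)

# Identical statistics give a distance of zero; mismatched means and
# variances are both penalized.
print(fid_diagonal(np.zeros(2), np.ones(2), np.zeros(2), np.ones(2)))
print(fid_diagonal(np.zeros(2), np.ones(2), np.ones(2), 4 * np.ones(2)))
```

Lower is better, which is why the paper reports a 58% FID *decrease* as the improvement.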
Dataset Splits | Yes | We conduct the text-to-image generation experiment on 10k randomly selected COCO2014 validation set prompts over the Stable Diffusion v1.4 model at 512×512 resolution.
Hardware Specification | No | The paper makes general statements about resource-constrained scenarios and edge devices, but does not provide specific hardware details (e.g., GPU models, CPU types, memory) used for running the experiments.
Software Dependencies | No | The paper does not provide specific ancillary software details with version numbers (e.g., Python, PyTorch, CUDA versions) needed to replicate the experiments.
Experiment Setup | No | The paper mentions some methodological settings, such as allocating an additional 10% of channels for 2-bit quantization in MPQ-DM+ and empirically setting k=10 for search groups. However, it does not explicitly state crucial training hyperparameters such as learning rate, batch size, number of epochs, or optimizer settings in the main text. It notes that "Details can be found in Appendix", but these details are not included in the main paper content.