Modulated Diffusion: Accelerating Generative Modeling with Modulated Quantization

Authors: Weizhi Gao, Zhichao Hou, Junqi Yin, Feiyi Wang, Linyu Peng, Xiaorui Liu

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on CIFAR-10 and LSUN demonstrate that MoDiff significantly reduces activation quantization from 8 bits to 3 bits without performance degradation in post-training quantization (PTQ).
Researcher Affiliation | Collaboration | (1) Department of Computer Science, North Carolina State University; (2) National Center for Computational Science, Oak Ridge National Lab; (3) Department of Mechanical Engineering, Keio University. Correspondence to: Xiaorui Liu <EMAIL>.
Pseudocode | No | The paper describes the methodology in prose and mathematical formulations, but does not include any explicitly labeled "Pseudocode" or "Algorithm" blocks, nor does it present structured steps in a code-like format.
Open Source Code | Yes | Our code implementation is available at https://github.com/WeizhiGao/MoDiff.
Open Datasets | Yes | We majorly evaluate the effectiveness of our MoDiff on the CIFAR-10 (32×32), LSUN-Bedrooms (256×256), and LSUN-Church-Outdoor (256×256) datasets (Krizhevsky et al., 2009; Yu et al., 2015). For CIFAR-10, we use DDIM models with 100 denoising steps (Song et al., 2021a).
Dataset Splits | No | The paper mentions using well-known datasets such as CIFAR-10, LSUN, MS-COCO, and ImageNet for evaluation, and specifies the number of images generated for metrics (e.g., "50,000 generated images"), but it does not describe the training, validation, or test splits used, nor does it reference standard predefined splits by name.
Hardware Specification | No | Implementing acceleration on specialized hardware is beyond the scope of this work, but will be a promising future direction, which is plausible given the increasing hardware support for low-precision formats such as 4-bit integers (Dave et al., 2019).
Software Dependencies | No | The paper mentions using DeepSpeed for measuring binary operations and conducting Q-Diffusion experiments by directly using their provided code, but it does not specify version numbers for any software dependencies, such as the programming language (e.g., Python) or libraries (e.g., PyTorch, DeepSpeed).
Experiment Setup | Yes | We use DDIM models with 100 denoising steps (Song et al., 2021a). For the LSUN datasets, we use Latent Diffusion Models with downsampling factors of 4 and 8... We use 500 sampling steps for LDM-4 and 200 steps for LDM-8.
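For context on the headline result (reducing activation quantization from 8 bits to 3 bits in post-training quantization), the sketch below shows a generic uniform fake-quantizer and the reconstruction error it incurs at each bit-width. This is a minimal illustration of why naive low-bit activation PTQ degrades quality, not MoDiff's modulated quantization scheme; the function, distribution, and values here are illustrative assumptions, not from the paper.

```python
import numpy as np

def quantize_dequantize(x, bits):
    """Uniform asymmetric fake-quantization: map x onto a 2**bits-level grid
    spanning [x.min(), x.max()], then map back to floats."""
    levels = 2 ** bits - 1
    scale = (x.max() - x.min()) / levels
    q = np.round((x - x.min()) / scale)   # integer grid index in [0, levels]
    return q * scale + x.min()            # dequantized approximation of x

# Synthetic "activations": a standard normal tensor stands in for a layer output.
rng = np.random.default_rng(0)
acts = rng.normal(size=10_000).astype(np.float32)

# Mean absolute reconstruction error grows sharply as bit-width shrinks,
# which is the gap methods like MoDiff aim to close at 3 bits.
for bits in (8, 3):
    err = np.abs(acts - quantize_dequantize(acts, bits)).mean()
    print(f"{bits}-bit mean abs error: {err:.4f}")
```

With only 2³ = 8 grid levels covering the full activation range, the quantization step is roughly 36× coarser than at 8 bits, so plain uniform PTQ at 3 bits incurs large per-activation error.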