D^2-DPM: Dual Denoising for Quantized Diffusion Probabilistic Models

Authors: Qian Zeng, Jie Song, Han Zheng, Hao Jiang, Mingli Song

AAAI 2025

Reproducibility (each entry: Variable: Result, followed by the LLM response)
Research Type: Experimental. "Experimental results demonstrate that D2-DPM achieves superior generation quality, yielding a 1.42 lower FID than the full-precision model while achieving 3.99x compression and 11.67x bit-operation acceleration." Experiment Settings — Dataset and Metrics: "We evaluated the proposed D2-DPM using LDM (Rombach et al. 2022) across three standard datasets: ImageNet, LSUN-Bedrooms, and LSUN-Churches (Yu et al. 2015), each with a resolution of 256×256. To quantify generation performance, we employ metrics such as Fréchet Inception Distance (FID), sliding Fréchet Inception Distance (sFID), Inception Score (IS), precision, and recall for comprehensive evaluation. For each evaluation, we generate 50,000 samples and calculate these metrics using OpenAI's evaluator (Dhariwal and Nichol 2021), with BOPs (Bit Operations) as the efficiency metric." Ablation Study: "As shown in Table 5, we perform ablation studies on the denoising components of the dual denoising mechanisms, S-D2 and D-D2."
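The BOPs efficiency metric quoted above can be made concrete. A common convention (assumed here, not taken from the paper's code) is that a layer's bit-operations equal its multiply-accumulate count times the weight and activation bit-widths; the sketch below uses hypothetical layer shapes and the paper's protocol of keeping the first and last layers at 8 bits.

```python
# Minimal sketch of BOPs as an efficiency metric, assuming
# BOPs = MACs * w_bits * a_bits per layer. Layer shapes are illustrative,
# not the LDM architecture.

def conv_macs(in_ch, out_ch, k, out_h, out_w):
    """Multiply-accumulate count of one conv layer."""
    return out_ch * out_h * out_w * in_ch * k * k

def model_bops(layers, w_bits, a_bits, first_last_bits=8):
    """Total bit-operations; first and last layers stay at first_last_bits,
    mirroring the paper's quantization protocol."""
    total = 0
    for i, (in_ch, out_ch, k, h, w) in enumerate(layers):
        if i == 0 or i == len(layers) - 1:
            wb = ab = first_last_bits
        else:
            wb, ab = w_bits, a_bits
        total += conv_macs(in_ch, out_ch, k, h, w) * wb * ab
    return total

# Toy layer list: (in_ch, out_ch, kernel, out_h, out_w).
layers = [(3, 64, 3, 64, 64), (64, 128, 3, 32, 32), (128, 3, 3, 64, 64)]
fp32 = model_bops(layers, 32, 32, first_last_bits=32)
w4a8 = model_bops(layers, 4, 8)
print(f"BOP acceleration: {fp32 / w4a8:.2f}x")
```

The exact acceleration factor depends on the real layer shapes, which is why the reported 11.67x cannot be recovered from this toy configuration.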
Researcher Affiliation: Collaboration. Qian Zeng1, Jie Song1*, Han Zheng1, Hao Jiang2, Mingli Song1 (1 Zhejiang University, 2 Alibaba Group). EMAIL, EMAIL
Pseudocode: Yes. "Algorithm 1 summarizes the procedure of the proposed dual denoising mechanism."
Open Source Code: Yes. Code: https://github.com/TaylorJocelyn/D2-DPM
Open Datasets: Yes. "We evaluated the proposed D2-DPM using LDM (Rombach et al. 2022) across three standard datasets: ImageNet, LSUN-Bedrooms, and LSUN-Churches (Yu et al. 2015), each with a resolution of 256×256."
Dataset Splits: No. The paper mentions collecting calibration data but does not specify train/test/validation splits for the datasets used in the main experiments, or whether standard splits are used. It states: "For calibration, we collect the diffusion model's inputs at each sampling timestep as the calibration set." There is no further detail on dataset splitting.
Hardware Specification: No. The paper does not explicitly mention any specific hardware (GPU, CPU models, or memory details) used for running the experiments. It generally discusses "resource-constrained scenarios" and "edge devices" but does not specify the hardware used for its own evaluation.
Software Dependencies: No. The paper does not provide specific software dependencies or version numbers for libraries, frameworks, or programming languages used in the experiments.
Experiment Setup: Yes. "LDM settings. We primarily focus on the generative sampler parameters in LDM: the classifier-free guidance scale, sampling steps, and variance schedule η. Since LDM employs the DDIM sampler, it degrades to an ODE-based sampler with zero stochasticity capacity when η = 0 and becomes an SDE-based DDPM sampler with inherent stochasticity capacity when η = 1. Therefore, we simulate stochasticity capacity changes by adjusting the scale. In class-conditional generation, we set four parameter configurations: {scale = 3.0, η = 0.0|1.0, steps = 20} and {scale = 1.5, η = 0.0|1.0, steps = 250}. For unconditional generation, we set two parameter configurations: {η = 0.0|1.0, steps = 200}. Quantization Settings. We employ BRECQ (Li et al. 2021) as the PTQ baseline for extensive comparative experiments and implement an LDM-compatible version of QDrop (Wei et al. 2022). To ensure comparability, we keep all settings aligned with PTQD, specifically: 1) using AdaRound (Nagel et al. 2020) as the weight quantizer; and 2) fixing the first and last layers to 8 bits, while quantizing other layers to the target bit-width. For calibration, we collect the diffusion model's inputs at each sampling timestep as the calibration set. Notationally, WxAy indicates that weights and activations are quantized to x and y bits, respectively. In all experiments, we adopt two quantization configurations: W8A8 and W4A8."
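The WxAy notation in the quoted setup can be illustrated with a minimal round-to-nearest uniform quantizer. This is only a sketch of what "quantizing to n bits" means; the paper's actual pipeline uses BRECQ with AdaRound weight rounding, which is not reproduced here.

```python
import numpy as np

# Illustrative symmetric uniform quantizer (fake-quant: quantize then
# dequantize). NOT the authors' BRECQ/AdaRound pipeline.

def quantize(x, n_bits):
    """Round-to-nearest symmetric quantization with max-abs scaling."""
    qmax = 2 ** (n_bits - 1) - 1
    scale = np.abs(x).max() / qmax
    q = np.clip(np.round(x / scale), -qmax - 1, qmax)
    return q * scale  # dequantized values at n_bits precision

rng = np.random.default_rng(0)
w = rng.normal(size=(64, 64)).astype(np.float32)
w8 = quantize(w, 8)  # the "W8" in W8A8
w4 = quantize(w, 4)  # the "W4" in W4A8
print(np.abs(w - w8).mean(), np.abs(w - w4).mean())
```

As expected, the 4-bit reconstruction error is substantially larger than the 8-bit one, which is why W4A8 is the more challenging configuration in the paper's experiments.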