DGQ: Distribution-Aware Group Quantization for Text-to-Image Diffusion Models
Authors: Hyogon Ryu, NaHyeon Park, Hyunjung Shim
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We tested our method on various datasets, including MS-COCO (Lin et al., 2014) and Parti Prompts (Yu et al., 2022), and confirmed its superior performance in generating high-quality and text-aligned images. Our method achieved a reduction of 1.29 in FID score compared to full precision and an almost identical CLIP score (a decrease of only 0.001) on MS-COCO dataset, while saving 93.7% in bit operations (from 694 TBOPs to 43.4 TBOPs). |
| Researcher Affiliation | Academia | Hyogon Ryu, NaHyeon Park, Hyunjung Shim; Korea Advanced Institute of Science and Technology (KAIST) |
| Pseudocode | No | The paper describes the methodology in detail in Section 3.3 "DISTRIBUTION-AWARE GROUP QUANTIZATION" but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code is available at https://github.com/ugonfor/DGQ. |
| Open Datasets | Yes | We tested our method on various datasets, including MS-COCO (Lin et al., 2014) and Parti Prompts (Yu et al., 2022), and confirmed its superior performance in generating high-quality and text-aligned images. |
| Dataset Splits | Yes | The dataset used for calibration during quantization was generated using 64 captions from the MS-COCO Dataset (Lin et al., 2014). Similar to the approach taken in Tang et al. (2023), we evaluated prompt generalization performance using the Parti Prompts (Yu et al., 2022) dataset, which differs from the calibration dataset. For the text-to-image model, we used Stable Diffusion v1.4. We measured FID (Heusel et al., 2017) and IS (Salimans et al., 2016) scores to evaluate image quality, and the CLIP score to evaluate text-image alignment. For main results (Table 2), we compute the FID and IS using 30K samples. For the ablation study (Table 3), we use 10K samples. |
| Hardware Specification | Yes | On the other hand, DGQ used only 64 sample prompts during the activation quantization process and was completed in about 20 minutes on just one RTX A6000 (based on Stable Diffusion v1.4 with 25 steps). |
| Software Dependencies | No | The paper mentions employing the "diffusers" library, but does not specify its exact version or other key software components with their respective version numbers (e.g., Python, PyTorch, CUDA). |
| Experiment Setup | Yes | Unless specified otherwise, we apply 25 inference steps for computational efficiency. ... For Outlier-preserving Group Quantization, a group size of 8 was used. ... For Attention-aware Quantization, we applied a Log Quantizer, separating the `<start>` token, and utilized dynamic quantization. |
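The setup row above mentions two quantizer choices from the paper: group quantization with a group size of 8 and a log quantizer for attention. The sketch below illustrates both ideas in generic NumPy. The function names `group_quantize` and `log_quantize`, the bit widths, and the per-group max-abs scaling rule are illustrative assumptions, not the paper's exact DGQ implementation.

```python
import numpy as np

def group_quantize(x, group_size=8, n_bits=8):
    """Uniform per-group quantization (illustrative sketch).

    Channels are split into groups of `group_size`, each with its own
    scale, so an outlier channel only distorts its own small group.
    The paper's exact grouping/outlier-handling rule is assumed here.
    """
    orig_shape = x.shape
    x = x.reshape(-1, group_size)
    qmax = 2 ** (n_bits - 1) - 1
    scale = np.abs(x).max(axis=1, keepdims=True) / qmax
    scale = np.where(scale == 0, 1.0, scale)          # avoid divide-by-zero
    q = np.clip(np.round(x / scale), -qmax - 1, qmax)  # integer codes
    return (q * scale).reshape(orig_shape)             # dequantized values

def log_quantize(x, n_bits=4):
    """Log-scale quantizer for non-negative attention scores (sketch).

    Values are snapped to powers of two, preserving the many small
    scores that a uniform grid would flush to zero.
    """
    eps = 1e-10
    exp = np.clip(np.round(np.log2(np.maximum(x, eps))),
                  -(2 ** n_bits - 1), 0)
    return np.where(x < eps, 0.0, 2.0 ** exp)
```

With a per-group scale, an activation tensor whose outliers are confined to a few channels keeps fine resolution elsewhere, which is the intuition behind the "outlier-preserving" label in the table.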