DiffPC: Diffusion-based High Perceptual Fidelity Image Compression with Semantic Refinement

Authors: Yichong Xia, Yimin Zhou, Jinpeng Wang, Baoyi An, Haoqian Wang, Yaowei Wang, Bin Chen

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments demonstrate that our method achieves state-of-the-art perceptual fidelity and surpasses previous perceptual image compression methods by a significant margin in statistical fidelity. ... 4 EXPERIMENTS
Researcher Affiliation | Collaboration | Yichong Xia1,3, Yimin Zhou1, Jinpeng Wang1, Baoyi An4, Haoqian Wang1, Bin Chen2,3; 1Tsinghua Shenzhen International Graduate School, 2Harbin Institute of Technology, Shenzhen, 3Peng Cheng Laboratory, 4Huawei Technologies Company Ltd.
Pseudocode | Yes | Our DiffPC framework is illustrated in Figure 2 and the pseudocode for the algorithm can be found in Appendix A.2. ... Algorithm 1 Encoding Process ... Algorithm 2 Decoding Process
Open Source Code | Yes | Code is released at https://github.com/Darc8-sun/DIFFPC.
Open Datasets | Yes | For validation, we referenced (Muckley et al., 2023) and employed three widely recognized image compression benchmark datasets: CLIC2020 (George Toderici, 2020), DIV2K (Timofte et al., 2017), and Kodak (Company). ... Furthermore, following the approaches of (Hoogeboom et al., 2023; Careil et al., 2024), we validated the model's statistical fidelity using COCO30K (Lin et al., 2014) and present the results in Appendix A.8. Our model was trained on the LSDIR dataset (Li et al., 2023b).
Dataset Splits | Yes | Our model was trained on the LSDIR dataset (Li et al., 2023b), which comprises 84,991 high-definition natural images. ... For validation, we referenced (Muckley et al., 2023) and employed three widely recognized image compression benchmark datasets: CLIC2020 (George Toderici, 2020), DIV2K (Timofte et al., 2017), and Kodak (Company).
Hardware Specification | Yes | Additionally, all experiments were conducted on an Nvidia A6000 GPU. ... Except for Perco, all tests were conducted on the same Nvidia 3080ti GPU. Due to high VRAM usage during inference, Perco was tested on an A6000 GPU, which has superior GFLOPS.
Software Dependencies | No | Our foundational conditional diffusion model leverages Stable Diffusion 2.1-base. ... For LPIPS, we utilized the lpips library, while DISTS was implemented using DISTS pytorch. FID and KID metrics were calculated using functions provided by torchmetrics.image, with a feature size of 2048.
Experiment Setup | Yes | Our model was trained on the LSDIR dataset (Li et al., 2023b)... During training, these images were randomly cropped to a resolution of 512 x 512. Our foundational conditional diffusion model leverages Stable Diffusion 2.1-base. Throughout all training stages, we employed AdamW (Loshchilov, 2017) as the optimizer, with learning rates set at 1e-4 for the initial phase and 5e-5 for the subsequent phase. The batch size was consistently maintained at 2. In the initial training phase, we employed an entropy estimator SCCTX (He et al., 2022) with a group number of 3. To achieve compression at different bit rates, we set the parameter λ2 in Section 3.2 to 0.2 and then adjusted λ1 ∈ {4, 16, 64, 128}. This stage is trained for 80,000 steps. In the second training phase, the parameters of the compressor were frozen. ... This stage is trained for 60,000 steps. We did not apply warm-up in the first stage but utilized a Lambda Linear Scheduler with parameters warm_up_steps=10000 and f_start=1e-6 in the second stage. For sampling, we utilized IDDPM (Nichol & Dhariwal, 2021) as the sampler with a uniform setting of 50 sampling steps...
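The second-phase warm-up described above (Lambda Linear Scheduler, warm_up_steps=10000, f_start=1e-6, base learning rate 5e-5) can be sketched as a pure-Python learning-rate multiplier. This is a minimal sketch, not the authors' implementation: the paper only names the scheduler and its two parameters, so the exact shape (linear ramp from f_start to 1.0, then held constant) and the f_max=1.0 plateau are assumptions here.

```python
def lambda_linear_warmup(step, warm_up_steps=10_000, f_start=1e-6, f_max=1.0):
    """Learning-rate multiplier: ramps linearly from f_start to f_max over
    warm_up_steps, then holds at f_max. The ramp-then-hold shape is an
    assumption; the quoted setup specifies only warm_up_steps and f_start."""
    if step < warm_up_steps:
        return f_start + (f_max - f_start) * step / warm_up_steps
    return f_max

# Effective second-phase learning rate at a few steps (base LR 5e-5 per the setup)
base_lr = 5e-5
lr_start = base_lr * lambda_linear_warmup(0)        # f_start * base_lr, near zero
lr_mid = base_lr * lambda_linear_warmup(5_000)      # roughly half the base LR
lr_after = base_lr * lambda_linear_warmup(10_000)   # full base LR once warmed up
```

In a PyTorch training loop this multiplier would typically be passed to `torch.optim.lr_scheduler.LambdaLR` wrapping the AdamW optimizer, with the scheduler stepped once per training iteration.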