Once-for-All: Controllable Generative Image Compression with Dynamic Granularity Adaptation
Authors: Anqi Li, Feng Li, Yuxi Liu, Runmin Cong, Yao Zhao, Huihui Bai
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our comprehensive experimental results demonstrate the outstanding adaptation capability of Control-GIC, which achieves superior performance in perceptual quality, flexibility, and compression efficiency over three types of recent state-of-the-art methods, including generative, progressive, and variable-rate compression methods, using only a single unified model. We evaluate our method on the Kodak (Kodak, 1993), DIV2K (Agustsson & Timofte, 2017), and CLIC2020 (Toderici et al., 2020) datasets. |
| Researcher Affiliation | Academia | Anqi Li¹,², Feng Li³, Yuxi Liu¹,², Runmin Cong⁴, Yao Zhao¹,², Huihui Bai¹,². ¹ Institute of Information Science, Beijing Jiaotong University; ² Beijing Key Laboratory of Advanced Information Science and Network Technology; ³ School of Computer Science and Engineering, Hefei University of Technology; ⁴ School of Control Science and Engineering, Shandong University. EMAIL, EMAIL, EMAIL |
| Pseudocode | No | The paper describes the methodology using natural language and mathematical formulations (e.g., Equation 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13) but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain an explicit statement about the release of source code or a link to a code repository. |
| Open Datasets | Yes | We randomly select 300K images from the Open Images (Krasin et al., 2017) dataset as our training set, where the images are randomly cropped to a uniform 256×256 resolution. We evaluate our method on the Kodak (Kodak, 1993), DIV2K (Agustsson & Timofte, 2017), and CLIC2020 (Toderici et al., 2020) datasets. |
| Dataset Splits | No | The paper states: 'We randomly select 300K images from the Open Images (Krasin et al., 2017) dataset as our training set, where the images are randomly cropped to a uniform 256×256 resolution.' and 'We evaluate our method on the Kodak (Kodak, 1993), DIV2K (Agustsson & Timofte, 2017), and CLIC2020 (Toderici et al., 2020) datasets.' While it identifies the training and evaluation datasets and their sizes, it neither specifies training/validation/test splits for reproducibility nor refers to standard splits of these datasets. |
| Hardware Specification | Yes | We train the model for 0.6M iterations with the learning rate of 5×10⁻⁵ on NVIDIA RTX 3090 GPUs. Throughout the training, we maintain the ratio setting of (50%, 40%, 10%) for the fine, medium, and coarse granularity, respectively. |
| Software Dependencies | No | Our method is based on MoVQ (Zheng et al., 2022), which improves the VQGAN model by adding spatial variants to representations within the decoder, avoiding the repeat artifacts in neighboring patches. We leverage the pre-trained codebook in MoVQ and carefully redesign the architecture. |
| Experiment Setup | Yes | We train the model for 0.6M iterations with the learning rate of 5×10⁻⁵ on NVIDIA RTX 3090 GPUs. Throughout the training, we maintain the ratio setting of (50%, 40%, 10%) for the fine, medium, and coarse granularity, respectively. Within our model, we take three representation granularities: 4×4, 8×8, and 16×16. The codebook C ∈ ℝ^{k×d} comprises k = 1024 code vectors, each with a dimension of d = 4. |
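
The reported configuration can be made concrete with a minimal sketch of vector quantization against a codebook of the stated size. This is not the authors' implementation: the random codebook stands in for the pre-trained MoVQ codebook, and `quantize` is a hypothetical helper illustrating only the reported dimensions (k = 1024 codes of dimension d = 4) and the three granularities with their training ratios.

```python
import numpy as np

# Reported configuration: codebook C in R^{k x d} with k = 1024, d = 4,
# and three spatial granularities used at ratios (50%, 40%, 10%).
K, D = 1024, 4
GRANULARITIES = ((4, 4), (8, 8), (16, 16))   # fine, medium, coarse patch sizes
GRANULARITY_RATIOS = (0.50, 0.40, 0.10)      # fine/medium/coarse training ratio

rng = np.random.default_rng(0)
codebook = rng.standard_normal((K, D))       # stand-in for the pre-trained codebook

def quantize(latents: np.ndarray) -> np.ndarray:
    """Map each d-dim latent vector to the index of its nearest codeword
    (squared Euclidean distance), as in standard VQ-based compression."""
    # latents: (N, D) -> pairwise distances: (N, K) -> nearest index: (N,)
    d2 = ((latents[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    return d2.argmin(axis=1)

# Example: quantize 16 latent vectors; each index addresses one of the K codes.
latents = rng.standard_normal((16, D))
indices = quantize(latents)
```

Each transmitted index costs at most log2(1024) = 10 bits, so coarser granularities (one code per 16×16 patch instead of per 4×4 patch) spend fewer bits per pixel, which is the lever behind the variable-rate behavior the table describes.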