On Disentangled Training for Nonlinear Transform in Learned Image Compression
Authors: Han Li, Shaohui Li, Wenrui Dai, Maida Cao, Nuowen Kan, Chenglin Li, Junni Zou, Hongkai Xiong
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results demonstrate that the proposed approach can accelerate training of LIC models by 2 times and simultaneously achieves an average 1% BD-rate reduction. To the best of our knowledge, this is one of the first successful attempts to significantly improve the convergence of LIC with comparable or superior rate-distortion performance. We perform ablation studies to further evaluate the effectiveness of our proposed AuxT. |
| Researcher Affiliation | Academia | ¹Shanghai Jiao Tong University, ²Tsinghua Shenzhen International Graduate School, Tsinghua University |
| Pseudocode | No | The paper describes the method using prose, equations, and diagrams (Figure 5) but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | Code will be released at https://github.com/qingshi9974/AuxT |
| Open Datasets | Yes | All the models are trained on the ImageNet-1k (Deng et al., 2009) dataset and optimized using the Adam optimizer (Kingma & Ba, 2015). We adopt three benchmark datasets, i.e., the Kodak image set (Kodak, 1993) with 24 images of 768×512 pixels, the Tecnick test set (Asuni & Giachetti, 2014) with 100 images of 1200×1200 pixels, and the CLIC Professional Validation dataset (CLIC, 2021) with 41 images of at most 2K resolution, for evaluation. |
| Dataset Splits | No | The paper mentions using ImageNet-1k for training and Kodak, Tecnick testset, and CLIC Professional Validation dataset for evaluation. While the latter are standard test sets, the paper does not specify the training/validation split for ImageNet-1k or other training data needed to reproduce the experimental setup. |
| Hardware Specification | Yes | Experiments are performed on an NVIDIA GeForce RTX 4090 GPU and an Intel Xeon Platinum 8260 CPU. |
| Software Dependencies | No | The paper mentions using the Adam optimizer, but does not provide specific version numbers for any software libraries, frameworks, or programming languages used. |
| Experiment Setup | Yes | We set the batch size to 16 for convolution-based LIC models (Minnen et al., 2018; He et al., 2022) and 8 for transformer-based LIC models (Zou et al., 2022; Liu et al., 2023). We train the models without our AuxT for 0.6M and 2M iterations, respectively, and train the models with our AuxT for 0.6M and 1M iterations, respectively. The learning rate is initialized as 10⁻⁴ and is decayed by a factor of 10 after 0.55M iterations in the 0.6M-iteration scenario, after 0.9M iterations in the 1M-iteration scenario, and after 1.8M iterations in the 2M-iteration scenario. The Lagrangian multipliers λ in the R-D loss are {0.0025, 0.0035, 0.0067, 0.0130, 0.0250, 0.0483} for MSE-optimized models and {2.40, 4.58, 8.73, 16.64, 31.73, 60.50} for MS-SSIM-optimized models. The orthogonal regularization weight λ_orth is 0.1. |
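The training hyperparameters reported in the Experiment Setup row can be collected into a configuration sketch. This is a minimal illustration of the reported schedule, not the authors' released code; all names are hypothetical, and the pairing of iteration budgets to model families follows the paper's "respectively" phrasing.

```python
# Hypothetical config mirroring the reported setup (names are illustrative).
BATCH_SIZE = {"cnn": 16, "transformer": 8}

# Total training iterations: baseline vs. AuxT-assisted, per model family.
TOTAL_ITERS = {
    ("cnn", "baseline"): 600_000,
    ("transformer", "baseline"): 2_000_000,
    ("cnn", "auxt"): 600_000,
    ("transformer", "auxt"): 1_000_000,
}

# LR starts at 1e-4 and is decayed by 10x late in training,
# at 0.55M / 0.9M / 1.8M iterations for the 0.6M / 1M / 2M budgets.
DECAY_POINT = {600_000: 550_000, 1_000_000: 900_000, 2_000_000: 1_800_000}

def learning_rate(iteration: int, total_iters: int) -> float:
    """Step LR schedule as described in the paper's setup."""
    return 1e-4 if iteration < DECAY_POINT[total_iters] else 1e-5

# Lagrangian multipliers for the R-D loss, one model per lambda.
LAMBDA_MSE = [0.0025, 0.0035, 0.0067, 0.0130, 0.0250, 0.0483]
LAMBDA_MSSSIM = [2.40, 4.58, 8.73, 16.64, 31.73, 60.50]

# Weight of the orthogonality regularizer.
LAMBDA_ORTH = 0.1
```

Encoding the schedule as a function of the total iteration budget makes it explicit that the decay point shifts with the budget (always within the last ~10% of training), which is easy to lose when the three scenarios are described in prose.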