Plug-and-Play Tri-Branch Invertible Block for Image Rescaling

Authors: Jingwei Bao, Jinhua Hao, Pengcheng Xu, Ming Sun, Chao Zhou, Shuyuan Zhu

AAAI 2025

Reproducibility assessment (Variable: Result, followed by the LLM response / supporting evidence):
Research Type: Experimental. Evidence: "Extensive experiments confirm that our method advances the state of the art in HR image reconstruction." Section 4 of the paper covers the experimental setup, quantitative and qualitative evaluation of T-IRN, and an ablation study.
Researcher Affiliation: Collaboration. Jingwei Bao (1,2), Jinhua Hao (2*), Pengcheng Xu (2*), Ming Sun (2), Chao Zhou (2), Shuyuan Zhu (1*). 1: University of Electronic Science and Technology of China, Chengdu, China; 2: Kuaishou Technology, Beijing, China.
Pseudocode: No. The paper gives the mathematical formulations for the transformations within the T-Inv Block (Equations 1, 2, and 3) and describes the methodological steps in prose, but it does not present an explicitly labeled pseudocode or algorithm block.
Open Source Code: Yes. Code: https://github.com/Jingwei-Bao/T-Inv Blocks
Open Datasets: Yes. Evidence: "We adopt 800 HR images from the widely-used DIV2K training set (Agustsson and Timofte 2017) to train our models. For evaluation, T-IRN is assessed on five standard test sets: Set5 (Bevilacqua et al. 2012), Set14 (Zeyde, Elad, and Protter 2010), BSD100 (Martin et al. 2001), Urban100 (Huang, Singh, and Ahuja 2015), and the DIV2K validation set (Agustsson and Timofte 2017)."
Dataset Splits: Yes. Evidence: "We adopt 800 HR images from the widely-used DIV2K training set (Agustsson and Timofte 2017) to train our models. For evaluation, T-IRN is assessed on five standard test sets: Set5 (Bevilacqua et al. 2012), Set14 (Zeyde, Elad, and Protter 2010), BSD100 (Martin et al. 2001), Urban100 (Huang, Singh, and Ahuja 2015), and the DIV2K validation set (Agustsson and Timofte 2017)."
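As a quick reference, the train/evaluation split reported in the two rows above can be summarized in a small Python mapping (the dictionary keys and names are illustrative, not taken from the paper's code):

```python
# Illustrative summary of the data split reported in the paper:
# 800 DIV2K HR images for training, five standard benchmarks for evaluation.
DATA_SPLITS = {
    "train": {"dataset": "DIV2K training set", "num_hr_images": 800},
    "eval": ["Set5", "Set14", "BSD100", "Urban100", "DIV2K validation set"],
}

print(DATA_SPLITS["train"]["num_hr_images"])  # 800
print(len(DATA_SPLITS["eval"]))               # 5
```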
Hardware Specification: No. The paper does not provide specific hardware details such as GPU models, CPU types, or memory configurations used for the experiments; it only details training parameters and software components.
Software Dependencies: No. The paper mentions using the Adam optimizer (Kingma and Ba 2014) but does not specify any software libraries or frameworks (e.g., PyTorch, TensorFlow) or their version numbers.
Experiment Setup: Yes. Evidence: "The ×2 model is trained for 800k iterations, halving the learning rate every 100k iterations, while the ×4 model is trained for 600k iterations, with the learning rate halved every 40k iterations. Loss weights are set to λ1 = 1 and λ2 = 0.25 to balance LR and HR loss and enhance HR detail restoration. For T-SAIN ×2 image rescaling, we employ a single downscaling module and a T-Inv Block as a compression simulator, doubling the modules for ×4. Both T-SAIN models are trained for 600k iterations, with the learning rate halved every 100k iterations. The training setup follows SAIN (Yang et al. 2023), including the loss function and JPEG codec ε with a QF of 75. In all T-IRN and T-SAIN experiments, the initial learning rate is 2 × 10⁻⁴, using L2 pixel loss for Llr and L1 pixel loss for Lhr in RGB space. Input images are cropped to 128 × 128 and augmented with random flips. We use the Adam optimizer (Kingma and Ba 2014) with β1 = 0.9, β2 = 0.999, and a mini-batch size of 16."