A Timestep-Adaptive Frequency-Enhancement Framework for Diffusion-based Image Super-Resolution

Authors: Yueying Li, Hanbin Zhao, Jiaqing Zhou, Guozhi Xu, Tianlei Hu, Gang Chen, Haobo Wang

IJCAI 2025

Reproducibility Variable Result LLM Response
Research Type Experimental Extensive experiments on three benchmark datasets verify the superior ISR performance of our method, e.g., achieving an average 5.40% improvement on CLIP-IQA compared to the best diffusion-based ISR baseline.
Researcher Affiliation Collaboration Yueying Li¹,², Hanbin Zhao³, Jiaqing Zhou⁴, Guozhi Xu⁴, Tianlei Hu²,³, Gang Chen²,³ and Haobo Wang¹,² — ¹School of Software Technology, Zhejiang University; ²Hangzhou High-Tech Zone (Binjiang) Institute of Blockchain and Data Security; ³College of Computer Science and Technology, Zhejiang University; ⁴ByteDance, Hangzhou
Pseudocode No The paper describes the proposed framework and its modules (TDC, APEM, HLEM) in natural language and illustrates them with a diagram (Figure 3), but it does not contain any explicitly labeled pseudocode or algorithm blocks.
Open Source Code Yes Our source code and appendix are available at https://github.com/liyueying233/TFDSR.
Open Datasets Yes Training Datasets. We train TFDSR on the first 10K real-world images from LSDIR [Li et al., 2023b] and the first 10K face images from FFHQ [Karras et al., 2021], which are cropped into 512×512 patches. ... Testing Datasets. ... (1) For the synthetic dataset, we use 3,000 generated pairs of LR-HR images from the DIV2K validation set [Agustsson and Timofte, 2017], ... (2) For the real-world datasets, we utilize the DRealSR [Wei et al., 2020] and RealSR [Ji et al., 2020] datasets...
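The 512×512 patch cropping described in the quoted training setup can be sketched as below; `random_crop` is an illustrative helper written for this report, not code from the released TFDSR repository.

```python
import numpy as np

def random_crop(img: np.ndarray, size: int = 512) -> np.ndarray:
    """Randomly crop a size x size patch from an H x W x C image array."""
    h, w = img.shape[:2]
    assert h >= size and w >= size, "image smaller than patch size"
    top = np.random.randint(0, h - size + 1)
    left = np.random.randint(0, w - size + 1)
    return img[top:top + size, left:left + size]
```

A training pipeline would apply such a crop to each of the 10K LSDIR and 10K FFHQ images before feeding them to the model.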
Dataset Splits Yes Training Datasets. We train TFDSR on the first 10K real-world images from LSDIR [Li et al., 2023b] and the first 10K face images from FFHQ [Karras et al., 2021]... Testing Datasets. ... we use 3,000 generated pairs of LR-HR images from the DIV2K validation set [Agustsson and Timofte, 2017]... For the real-world datasets, we utilize the DRealSR [Wei et al., 2020] and RealSR [Ji et al., 2020] datasets... Hyperparameters T_AP = 400, T_HL = 500, P_H = 0.05, and P_L = 0.9 are tuned using a validation set composed of 100 randomly selected images from the training set (LSDIR+FFHQ)...
Hardware Specification Yes Then we train the APEM for 600 iterations with a batch size of 32, a learning rate of 5×10⁻⁵, and 512×512 resolution on a single A100 GPU.
Software Dependencies No The paper mentions using a pre-trained baseline model (SeeSR) and general classes of models such as diffusion models (DDPM, LDM, U-Net), but it does not provide specific version numbers for software libraries or dependencies such as PyTorch, TensorFlow, Python, or CUDA.
Experiment Setup Yes We train the APEM for 600 iterations with a batch size of 32, a learning rate of 5×10⁻⁵, and 512×512 resolution on a single A100 GPU. During sampling, we utilize the adaptive frequency sampling strategy using the TDC module, which dynamically selects enhanced frequency components based on the current sampling timestep, with a total sampling step of 50. Hyperparameters T_AP = 400, T_HL = 500, P_H = 0.05, and P_L = 0.9 are tuned using a validation set...
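The timestep-dependent frequency selection described in this row can be sketched roughly as follows. The thresholds mirror the paper's reported hyperparameters (T_AP = 400, T_HL = 500, P_H = 0.05, P_L = 0.9), but the concrete band-selection rule and the `frequency_mask` helper are assumptions made for illustration, not the authors' TDC implementation.

```python
import numpy as np

# Reported hyperparameters (assumed roles: T_* are timestep thresholds,
# P_H / P_L are high- and low-frequency band proportions).
T_AP, T_HL = 400, 500
P_H, P_L = 0.05, 0.9

def frequency_mask(shape: tuple, t: int) -> np.ndarray:
    """Boolean mask over a centered (fftshift-ed) FFT plane marking the
    frequency components to enhance at diffusion timestep t.

    Sketch of the assumed schedule: noisy early timesteps (large t)
    enhance low-frequency structure; late timesteps (small t) enhance
    high-frequency detail; timesteps in between enhance everything.
    """
    h, w = shape
    yy, xx = np.meshgrid(np.arange(h) - h // 2,
                         np.arange(w) - w // 2, indexing="ij")
    # Normalized radial frequency in [0, 1], 0 at the DC component.
    radius = np.sqrt(yy**2 + xx**2) / np.sqrt((h // 2) ** 2 + (w // 2) ** 2)
    if t >= T_HL:               # early, noisy steps: low-frequency band
        return radius <= P_L
    elif t >= T_AP:             # middle steps: all components
        return np.ones(shape, dtype=bool)
    else:                       # late steps: highest-frequency band only
        return radius >= 1.0 - P_H
```

With 50 total sampling steps, such a mask would be recomputed at each step from the current timestep and applied to the enhanced frequency components before the inverse transform.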