A Timestep-Adaptive Frequency-Enhancement Framework for Diffusion-based Image Super-Resolution
Authors: Yueying Li, Hanbin Zhao, Jiaqing Zhou, Guozhi Xu, Tianlei Hu, Gang Chen, Haobo Wang
IJCAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on three benchmark datasets verify the superior ISR performance of our method, e.g., achieving an average 5.40% improvement on CLIP-IQA compared to the best diffusion-based ISR baseline. |
| Researcher Affiliation | Collaboration | Yueying Li (1,2), Hanbin Zhao (3), Jiaqing Zhou (4), Guozhi Xu (4), Tianlei Hu (2,3), Gang Chen (2,3), and Haobo Wang (1,2). 1: School of Software Technology, Zhejiang University; 2: Hangzhou High-Tech Zone (Binjiang) Institute of Blockchain and Data Security; 3: College of Computer Science and Technology, Zhejiang University; 4: ByteDance, Hangzhou |
| Pseudocode | No | The paper describes the proposed framework and its modules (TDC, APEM, HLEM) in natural language and illustrates them with a diagram (Figure 3), but it does not contain any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our source code and appendix are available at https://github.com/liyueying233/TFDSR. |
| Open Datasets | Yes | Training Datasets. We train TFDSR on the first 10K real-world images from LSDIR [Li et al., 2023b] and the first 10K face images from FFHQ [Karras et al., 2021], which are cropped into 512×512 patches. ... Testing Datasets. ... (1) For the synthetic dataset, we use 3,000 generated pairs of LR-HR images from the DIV2K validation set [Agustsson and Timofte, 2017], ... (2) For the real-world datasets, we utilize the DRealSR [Wei et al., 2020] and RealSR [Ji et al., 2020] datasets... |
| Dataset Splits | Yes | Training Datasets. We train TFDSR on the first 10K real-world images from LSDIR [Li et al., 2023b] and the first 10K face images from FFHQ [Karras et al., 2021]... Testing Datasets. ... we use 3,000 generated pairs of LR-HR images from the DIV2K validation set [Agustsson and Timofte, 2017]... For the real-world datasets, we utilize the DRealSR [Wei et al., 2020] and RealSR [Ji et al., 2020] datasets... Hyperparameters T_AP = 400, T_HL = 500, P_H = 0.05, and P_L = 0.9 are tuned using a validation set composed of 100 randomly selected images from the training set (LSDIR+FFHQ)... |
| Hardware Specification | Yes | Then we train the APEM for 600 iterations with a batch size of 32, a learning rate of 5×10⁻⁵, and 512×512 resolution on a single A100 GPU. |
| Software Dependencies | No | The paper mentions using a pre-trained baseline model (See SR) and general types of models like diffusion models (DDPM, LDM, U-Net) but does not provide specific version numbers for software libraries or dependencies like PyTorch, TensorFlow, Python, or CUDA. |
| Experiment Setup | Yes | We train the APEM for 600 iterations with a batch size of 32, a learning rate of 5×10⁻⁵, and 512×512 resolution on a single A100 GPU. During sampling, we utilize the adaptive frequency sampling strategy using the TDC module, which dynamically selects enhanced frequency components based on the current sampling timestep, with a total of 50 sampling steps. Hyperparameters T_AP = 400, T_HL = 500, P_H = 0.05, and P_L = 0.9 are tuned using a validation set... |
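For quick reference, the training and sampling settings quoted above can be collected into a single configuration sketch. This is purely illustrative: the key names (`training`, `sampling`, `T_AP`, etc.) are our own, not taken from the released TFDSR code, and only the numeric values come from the paper.

```python
# Hypothetical configuration sketch of the reported TFDSR experiment setup.
# Values are quoted from the paper; key names are illustrative assumptions.
TFDSR_CONFIG = {
    "training": {
        "module": "APEM",            # the trained module (see paper, Sec. on APEM)
        "iterations": 600,
        "batch_size": 32,
        "learning_rate": 5e-5,       # reported as 5×10⁻⁵
        "resolution": (512, 512),    # patch size cropped from LSDIR + FFHQ
        "hardware": "1x NVIDIA A100",
    },
    "sampling": {
        "total_steps": 50,           # diffusion sampling steps with the TDC module
        # Thresholds/proportions tuned on a 100-image validation split
        # of the training set (LSDIR + FFHQ):
        "T_AP": 400,
        "T_HL": 500,
        "P_H": 0.05,
        "P_L": 0.9,
    },
}

if __name__ == "__main__":
    for section, params in TFDSR_CONFIG.items():
        print(section, params)
```

A structure like this makes it easy to verify at a glance that a reproduction run matches the reported hyperparameters.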