Universal Image Restoration Pre-training via Degradation Classification
Authors: Jiakui Hu, Lujia Jin, Zhengjian Yao, Yanye Lu
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our DCPT in four different settings. (1) All-in-One: a single model is fine-tuned after DCPT to perform image restoration across multiple degradations. Following previous state-of-the-art works (Zhang et al., 2023; Luo et al., 2023b), we evaluate on five-degradation (5D) and 10D restoration tasks. (2) Single-task: we report the performance on unseen degradations of all-in-one trained models without fine-tuning, following (Zhang et al., 2023; Ai et al., 2024). To highlight DCPT's impact on single-task pre-training, we present fine-tuning results of DCPT pre-trained models in specific single-task settings. (3) Mixed degradation: we evaluate the fine-tuned model under mixed degradations to verify whether DCPT is suitable for restoring complex mixed-degraded images, such as composite weather. (4) Transfer learning: we evaluate the transfer learning capability, between different image restoration tasks, of restoration models trained with or without DC-guided training. |
| Researcher Affiliation | Collaboration | ¹Institute of Medical Technology, Peking University Health Science Center, Peking University; ²Biomedical Engineering Department, College of Future Technology, Peking University; ³National Biomedical Imaging Center, Peking University; ⁴China Mobile Research Institute |
| Pseudocode | Yes | To illustrate the simplicity and efficacy of DCPT, we present the PyTorch-like code of DCPT here. We hope that this code will further improve the reproducibility of DCPT. ### train to generate the clean image encoder.train() decoder.eval() optimizer_encoder.zero_grad() pix_output = encoder(gt, hook=False) l_total = 0 # pixel loss if cri_pixel: l_pix = cri_pixel(pix_output, gt) l_total += l_pix ### train to classify the degradation decoder.train() optimizer_decoder.zero_grad() hook_outputs = encoder(lq, hook=True) cls_output = decoder(lq, hook_outputs[::-1]) # classification loss if cri_cls: l_cls = cri_cls(cls_output, dataset_idx) l_total += l_cls l_total.backward() optimizer_encoder.step() optimizer_decoder.step() |
| Open Source Code | Yes | Source code and models are available at https://github.com/MILab-PKU/dcpt. |
| Open Datasets | Yes | For 3D and 5D all-in-one restoration, following AirNet (Li et al., 2022), a combination of various image restoration datasets is employed: Rain200L (Yang et al., 2017), which contains 200 training images for deraining; RESIDE (Li et al., 2018), which contains 72,135 training images and 500 test images (SOTS) for dehazing; BSD400 (Martin et al., 2001b) and WED (Ma et al., 2016), which together contain 5,144 training images for Gaussian denoising; GoPro (Nah et al., 2017), which contains 2,103 training images and 1,111 test images for single-image motion deblurring; and LOL (Wei et al., 2018), which contains 485 training images and 15 test images for low-light enhancement. |
| Dataset Splits | Yes | Before the experiment, we randomly split the data into training and test sets at a ratio of 2:1, ensuring that the data volume for each degradation is evenly distributed between the training and test sets. |
| Hardware Specification | Yes | During DCPT, image restoration models (encoder) and degradation classifiers (decoder) are all trained by AdamW (Kingma & Ba, 2014) with no weight decay for 100k iters with batch-size 32 on 128×128 image patches on 4 NVIDIA L40 GPUs. |
| Software Dependencies | No | The paper implies the use of PyTorch through "PYTORCH-LIKE CODE" and mentions "AdamW (Kingma & Ba, 2014)" as the optimizer. However, it does not provide specific version numbers for these software components or for any other libraries used. |
| Experiment Setup | Yes | During DCPT, image restoration models (encoder) and degradation classifiers (decoder) are all trained by AdamW (Kingma & Ba, 2014) with no weight decay for 100k iters with batch-size 32 on 128×128 image patches on 4 NVIDIA L40 GPUs. Due to the heterogeneous encoder-decoder design, we employ distinct learning rates for the encoder and decoder. The learning rate is set to 3×10⁻⁴ for the encoder and 1×10⁻⁴ for the decoder, and is kept constant throughout DCPT. |
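The flattened pseudocode quoted in the table can be expanded into a runnable sketch of one DCPT training step. The tiny `Encoder`/`Decoder` modules, the L1 pixel loss, and the cross-entropy classification loss below are illustrative stand-ins (the paper's actual architectures and criteria, `cri_pixel`/`cri_cls`, are not specified in the excerpt); only the two-phase structure, the reversed hook features, and the AdamW settings (no weight decay, lr 3×10⁻⁴ / 1×10⁻⁴) follow the quoted text.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Encoder(nn.Module):
    """Hypothetical restoration encoder that can expose intermediate features."""
    def __init__(self, n_feat=8):
        super().__init__()
        self.stage1 = nn.Conv2d(3, n_feat, 3, padding=1)
        self.stage2 = nn.Conv2d(n_feat, n_feat, 3, padding=1)
        self.head = nn.Conv2d(n_feat, 3, 3, padding=1)

    def forward(self, x, hook=False):
        f1 = torch.relu(self.stage1(x))
        f2 = torch.relu(self.stage2(f1))
        if hook:
            return [f1, f2]      # intermediate features for the classifier
        return self.head(f2)     # restored image

class Decoder(nn.Module):
    """Hypothetical degradation classifier fed by hooked encoder features."""
    def __init__(self, n_feat=8, n_degradations=5):
        super().__init__()
        self.proj = nn.Conv2d(3, n_feat, 3, padding=1)
        self.cls = nn.Linear(n_feat, n_degradations)

    def forward(self, lq, feats):
        h = self.proj(lq)
        for f in feats:          # fuse hooked features (passed in reverse order)
            h = h + f
        return self.cls(h.mean(dim=(2, 3)))  # logits over degradation types

encoder, decoder = Encoder(), Decoder()
opt_e = torch.optim.AdamW(encoder.parameters(), lr=3e-4, weight_decay=0.0)
opt_d = torch.optim.AdamW(decoder.parameters(), lr=1e-4, weight_decay=0.0)

gt = torch.rand(2, 3, 16, 16)       # clean patches (toy size, not 128x128)
lq = torch.rand(2, 3, 16, 16)       # degraded patches
dataset_idx = torch.tensor([0, 3])  # degradation labels

# Phase 1: train the encoder to regenerate the clean image (pixel loss).
encoder.train(); decoder.eval()
opt_e.zero_grad(); opt_d.zero_grad()
l_pix = F.l1_loss(encoder(gt, hook=False), gt)

# Phase 2: train the decoder to classify the degradation from hooked features.
decoder.train()
hook_outputs = encoder(lq, hook=True)
cls_output = decoder(lq, hook_outputs[::-1])
l_cls = F.cross_entropy(cls_output, dataset_idx)

# Joint backward pass; both optimizers step, as in the quoted pseudocode.
(l_pix + l_cls).backward()
opt_e.step(); opt_d.step()
```

Note the design point preserved from the excerpt: both losses are summed into a single backward pass, so the classification gradient also flows into the encoder through the hooked features, while each network keeps its own learning rate via a separate AdamW instance.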