Image Reconstruction via Deep Image Prior Subspaces

Authors: Riccardo Barbano, Javier Antorán, Johannes Leuschner, José Miguel Hernández-Lobato, Bangti Jin, Željko Kereta

TMLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments across image restoration and tomographic tasks of differing geometry and ill-posedness show that second-order optimisation within a low-dimensional subspace is favourable in terms of the trade-off between optimisation stability and reconstruction fidelity.
Researcher Affiliation | Academia | Riccardo Barbano (EMAIL), Department of Computer Science, University College London; Javier Antorán (EMAIL), Department of Engineering, University of Cambridge; Johannes Leuschner (EMAIL), University of Bremen; José Miguel Hernández-Lobato (EMAIL), Department of Engineering, University of Cambridge; Bangti Jin (EMAIL), The Chinese University of Hong Kong; Željko Kereta (EMAIL), Department of Computer Science, University College London
Pseudocode | Yes | Algorithm 1: Early stop criterion. Inputs: metric g(·), patience p, decrease proportion δ. 1: gmin ← ∞, i ← 0, imin ← 0. 2: while i ≤ imin + p do: 3: if g(i) < δ · gmin then gmin ← g(i) and imin ← i. Output: imin.
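The early-stop criterion above can be sketched as a short, self-contained function. This is a hedged reconstruction from the quoted pseudocode, not the authors' implementation; the metric callable `g` and the `max_iters` safeguard are illustrative assumptions.

```python
def early_stop(g, patience=100, delta=0.995, max_iters=10_000):
    """Algorithm 1 sketch: return the iteration at which the monitored
    metric g last improved by at least a factor delta; stop once no such
    improvement occurs within `patience` further iterations."""
    g_min = float("inf")   # best metric value seen so far
    i_min = 0              # iteration of the last accepted improvement
    i = 0
    while i <= i_min + patience and i < max_iters:
        value = g(i)
        if value < delta * g_min:  # improvement by at least factor delta
            g_min = value
            i_min = i
        i += 1
    return i_min
```

For example, a metric that decays and then plateaus would return the last iteration before the plateau as the stopping point.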
Open Source Code | Yes | The method is built on top of the E-DIP library (github.com/educating-dip). The full implementation and data are available at github.com/subspace-dip.
Open Datasets | Yes | Our experiments cover a wide range of image restoration and tomographic reconstruction tasks. In Section 5.1, we conduct an ablation study on Cartoon Set (Royer et al., 2017). We compare reconstructions against the ground truth of the µCT Walnut data (Der Sarkissian et al., 2019). To investigate a medical setting, we use 10 clinical CT images of the human abdomen released by the Mayo Clinic (Moen et al., 2021). We conduct denoising and deblurring on five widely used RGB natural images (baboon, jet F16, house, Lena, peppers) of size (256 px)². The pre-training is done on ImageNet (Deng et al., 2009).
Dataset Splits | No | The paper uses various datasets (Cartoon Set, µCT Walnut, Mayo Clinic, Set5) and specifies the number of images used for testing/evaluation (e.g., "25 images", "10 clinical CT images", "five widely used RGB natural images"). However, it does not provide explicit training, validation, or test splits with percentages or sample counts for reproducing the data partitioning of its own experiments. It notes that "validation-based stopping criteria are often not viable in the unsupervised setting" and that a different approach was used to generate the pre-training data.
Hardware Specification | Yes | Runs are performed on an A100 GPU.
Software Dependencies | No | The paper mentions several software tools and libraries that were used or built upon, such as the E-DIP library (github.com/educating-dip), ODL (Adler et al., 2017), and the ASTRA Toolbox (van Aarle et al., 2015). However, it does not specify version numbers for these software components, which would be required for a reproducible description of ancillary software.
Experiment Setup | Yes | We simulate observations using (1) with dynamically scaled Gaussian noise ϵ ∼ N(0, σ²I_{dy}), where σ = (p/dy) Σ_{i=1}^{dy} |yi| and the noise scaling parameter is set to p = 0.05, unless noted otherwise. For the studies in Sections 5.2, 5.3 and 5.4, we use a standard fully convolutional U-Net architecture with either 3M (for CT reconstruction tasks) or 1.9M (for natural images) parameters; see the architectures in Appendix C.3. For the ablative analysis in Section 5.1, we use a shallower architecture (0.5M parameters) with only 64 channels and four scales, keeping the skip connections in the lower layers. Following the literature, we train the vanilla DIP (Ulyanov et al., 2018) (labelled DIP) and E-DIP (Barbano et al., 2022b) with Adam. We train subspace coefficients with Adam (Sub-DIP Adam), which serves as a baseline, L-BFGS (Sub-DIP L-BFGS), and NGD (Sub-DIP NGD). Pre-training: minimising (4) over 32k images of ellipses with random shape, location and intensity. We construct a dsub = 4k dimensional subspace and sparsify it down to dlev/dθ = 0.5 of the parameters. For the sparse setting (100 angles), we use a dsub = 4k dimensional subspace constructed from dpre = 5k checkpoints, but with sparsity ratio dlev/dθ = 0.25. For the more data-rich setting (300 angles), we use dsub = 8k, sampled from dpre = 10k checkpoints, and similarly sparsify it down to dlev/dθ = 0.25. We apply Algorithm 1 (cf. Appendix A) to (2), with δ = 0.995 and patience p = 100 (p = 1000 and δ = 1 for the Walnut data).
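The dynamically scaled noise model quoted above (σ = (p/dy) Σ |yi|, p = 0.05) can be sketched in a few lines of NumPy. This is a minimal illustration under the stated formula, not the paper's code; the function name `simulate_observation` and the seeded generator are assumptions for the example.

```python
import numpy as np

def simulate_observation(y, p=0.05, rng=None):
    """Add dynamically scaled Gaussian noise to a clean measurement y:
    sigma = (p / d_y) * sum_i |y_i|, so the noise level tracks the
    average magnitude of the measurement."""
    rng = np.random.default_rng(rng)
    d_y = y.size
    sigma = (p / d_y) * np.abs(y).sum()  # noise std scaled to mean |y|
    return y + rng.normal(0.0, sigma, size=y.shape)
```

For a measurement with mean absolute value 1, this yields noise with standard deviation p = 0.05, matching the default noise level used in the experiments.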