The Missing U for Efficient Diffusion Models
Authors: Sergio Calvo Ordoñez, Chun-Wun Cheng, Jiahao Huang, Lipei Zhang, Guang Yang, Carola-Bibiane Schönlieb, Angelica I Aviles-Rivero
TMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimenting with Denoising Diffusion Probabilistic Models (DDPMs), our framework operates with approximately a quarter of the parameters and 30% of the Floating Point Operations (FLOPs) compared to standard U-Nets in DDPMs. In this section, we detail the set of experiments to validate our proposed framework. |
| Researcher Affiliation | Academia | Sergio Calvo-Ordoñez (1,2), Chun-Wun Cheng (3), Jiahao Huang (4), Lipei Zhang (3), Guang Yang (4,5,6), Carola-Bibiane Schönlieb (3), Angelica I Aviles-Rivero (3). 1: Oxford-Man Institute of Quantitative Finance, University of Oxford; 2: Mathematical Institute, University of Oxford; 3: Department of Applied Mathematics and Theoretical Physics, University of Cambridge; 4: Bioengineering Department and Imperial-X & National Heart and Lung Institute, Imperial College London; 5: Cardiovascular Research Centre, Royal Brompton Hospital London; 6: School of Biomedical Engineering & Imaging Sciences, King's College London |
| Pseudocode | No | The paper describes the methodology using textual explanations, mathematical formulations, and architectural diagrams (e.g., Figure 1, Figure 2), but it does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide an explicit statement about releasing source code, a direct link to a code repository, or mention of code being included in supplementary materials for the methodology described. |
| Open Datasets | Yes | Table 1: Performance metrics across datasets: FID scores (measured every 5 steps for CelebA and LSUN, and at every step for MNIST), sampling timesteps (Steps), and average generation time for both the U-Net and continuous U-Net (cU-Net) models. |
| Dataset Splits | No | The paper mentions using datasets like MNIST, CelebA, and LSUN Church for evaluation and refers to a 'test dataset' in Table 2, but it does not provide specific details on the training, validation, and test splits for these datasets. |
| Hardware Specification | No | Note that inference times reported for both models were measured on a CPU, as current Python ODE-solver packages do not utilise GPU resources effectively, unlike the highly optimised code of conventional U-Net convolutional layers. |
| Software Dependencies | No | The paper mentions 'Python ODE-solver packages' in a general context but does not provide specific names or version numbers for any software dependencies required to replicate the experiments. |
| Experiment Setup | Yes | To compute the FID, we generated two datasets of 30,000 samples each, one from each of the models, in the same way as we generated the images shown in the figures above. These new datasets are then directly used for the FID score computation with a batch size of 512 for the feature extraction. We also note that we use the 2048-dimensional layer of the Inception network for feature extraction as this is a common choice to capture higher-level features. |