The Missing U for Efficient Diffusion Models
Authors: Sergio Calvo Ordoñez, Chun-Wun Cheng, Jiahao Huang, Lipei Zhang, Guang Yang, Carola-Bibiane Schönlieb, Angelica I Aviles-Rivero
TMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimenting with Denoising Diffusion Probabilistic Models (DDPMs), our framework operates with approximately a quarter of the parameters and 30% of the Floating Point Operations (FLOPs) compared to standard U-Nets in DDPMs. In this section, we detail the set of experiments to validate our proposed framework. |
| Researcher Affiliation | Academia | Sergio Calvo-Ordoñez (1,2), Chun-Wun Cheng (3), Jiahao Huang (4), Lipei Zhang (3), Guang Yang (4,5,6), Carola-Bibiane Schönlieb (3), Angelica I Aviles-Rivero (3). 1: Oxford-Man Institute of Quantitative Finance, University of Oxford; 2: Mathematical Institute, University of Oxford; 3: Department of Applied Mathematics and Theoretical Physics, University of Cambridge; 4: Bioengineering Department and Imperial-X & National Heart and Lung Institute, Imperial College London; 5: Cardiovascular Research Centre, Royal Brompton Hospital London; 6: School of Biomedical Engineering & Imaging Sciences, King's College London |
| Pseudocode | No | The paper describes the methodology using textual explanations, mathematical formulations, and architectural diagrams (e.g., Figure 1, Figure 2), but it does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide an explicit statement about releasing source code, a direct link to a code repository, or mention of code being included in supplementary materials for the methodology described. |
| Open Datasets | Yes | Table 1: Performance metrics across datasets: FID scores (measured every 5 steps for CelebA and LSUN, and at every step for MNIST), sampling timesteps (Steps), and average generation time for both the U-Net and continuous U-Net (cU-Net) models. |
| Dataset Splits | No | The paper mentions using datasets like MNIST, CelebA, and LSUN Church for evaluation and refers to a 'test dataset' in Table 2, but it does not provide specific details on the training, validation, and test splits for these datasets. |
| Hardware Specification | No | Note that inference times reported for both models were measured on a CPU, as current Python ODE-solver packages do not utilise GPU resources effectively, unlike the highly optimised code of conventional U-Net convolutional layers. |
| Software Dependencies | No | The paper mentions 'Python ODE-solver packages' in a general context but does not provide specific names or version numbers for any software dependencies required to replicate the experiments. |
| Experiment Setup | Yes | To compute the FID, we generated two datasets of 30,000 samples each, one from each of the models, in the same way as we generated the images shown in the figures above. These new datasets are then directly used for the FID score computation with a batch size of 512 for the feature extraction. We also note that we use the 2048-dimensional layer of the Inception network for feature extraction as this is a common choice to capture higher-level features. |