A Note on the Convergence of Denoising Diffusion Probabilistic Models

Authors: Sokhna Diarra Mbacke, Omar Rivasplata

TMLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The goal of these experiments is to assess the numerical value of the bound of Theorem 3.1 on a synthetic dataset. The data-generating distribution is chosen to be the uniform distribution on the square of side 2, centered at the origin. Figure 2 shows samples from this target distribution. The backward process uses a shared network with fully connected layers and 128 hidden units each. The model is trained on 50,000 samples from the original distribution, and the bound is computed with n = 5,000 independent samples. Samples from the trained model are shown in Figure 3. We computed the bound for different values of λ.
Researcher Affiliation | Academia | Sokhna Diarra Mbacke (Université Laval); Omar Rivasplata (University College London)
Pseudocode | No | The paper presents mathematical derivations and theoretical results; it does not contain any clearly labeled pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide an explicit statement about releasing source code, nor a link to a code repository for the methodology described.
Open Datasets | No | The data-generating distribution is chosen to be the uniform distribution on the square of side 2, centered at the origin; Figure 2 shows samples from this target distribution. This synthetic dataset is generated for the purposes of the paper, but no access information (link, citation, or generation script) is provided to reproduce it exactly.
Dataset Splits | No | The model is trained on 50,000 samples from the original distribution, and the bound is computed with n = 5,000 independent samples. While sample sizes are given for training and for computing the bound, there is no explicit description of traditional training/validation/test splits for the model's evaluation.
Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU or CPU models) used to run the experiments.
Software Dependencies | No | The paper does not name specific software packages or their version numbers.
Experiment Setup | Yes | The backward process uses a shared network with fully connected layers and 128 hidden units each. The model is trained on 50,000 samples from the original distribution, and the bound is computed with n = 5,000 independent samples. The Lipschitz norms K_t^θ are estimated using the estimate given in Remark 3.2, and the expected norms in the last two terms of Theorem 3.1 are estimated using 10^6 independent samples from each distribution.
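The synthetic setup described in the rows above can be sketched in a few lines. This is only an illustrative sketch, not the authors' code: the sample sizes (50,000 training points, n = 5,000 for the bound, 10^6 for the Monte Carlo estimates) come from the paper, while the function name `sample_target` and the choice of NumPy are assumptions. The final lines show how the expected-norm terms of Theorem 3.1 could be estimated by Monte Carlo; estimating the Lipschitz norms and the full bound would require the trained model and is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_target(n):
    """Draw n points from the uniform distribution on the square of
    side 2 centered at the origin, i.e. uniform on [-1, 1]^2."""
    return rng.uniform(-1.0, 1.0, size=(n, 2))

# Sample sizes reported in the paper.
train_data = sample_target(50_000)  # training set
bound_data = sample_target(5_000)   # n = 5,000 independent samples for the bound

# Monte Carlo estimate of an expected norm E[||X||] with 10^6 independent
# samples, as done for the last two terms of Theorem 3.1 (one such estimate
# per distribution appearing in those terms).
mc_samples = sample_target(1_000_000)
expected_norm = np.linalg.norm(mc_samples, axis=1).mean()
```

With 10^6 samples the Monte Carlo estimate of E[||X||] for this target is accurate to roughly two decimal places (the exact value for the unit-half-width square is about 0.765).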