Diffusion models for Gaussian distributions: Exact solutions and Wasserstein errors

Authors: Emile Pierret, Bruno Galerne

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this paper, we theoretically study the behavior of diffusion models and their numerical implementation when the data distribution is Gaussian. Our first contribution is to derive the analytical solutions of the backward SDE and the probability flow ODE and to prove that these solutions and their discretizations are all Gaussian processes. Our second contribution is to compute the exact Wasserstein errors between the target and the numerically sampled distributions for any numerical scheme. This allows us to monitor convergence directly in the data space, while experimental works limit their empirical analysis to Inception features. An implementation of our code is available online. ... While our theoretical analysis relies on an exactly known score function, we conduct additional experiments to assess the impact of the score approximation error. Surprisingly, in the context of texture synthesis, we show that with a score neural network trained for modeling a specific Gaussian microtexture, a stochastic Euler-Maruyama sampler is more faithful to the data distribution than Heun's method, thus highlighting the importance of the score approximation error in practical situations. ... We propose in Table 1 an ablation study to monitor the magnitude of each error and their accumulation for various sampling schemes for the CIFAR-10 example.
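The exact Wasserstein errors mentioned in the abstract rely on the closed-form 2-Wasserstein distance between Gaussian distributions. As a minimal sketch of that formula (the function name `gaussian_w2` is ours, not from the paper):

```python
import numpy as np
from scipy.linalg import sqrtm

def gaussian_w2(m1, S1, m2, S2):
    """Squared 2-Wasserstein distance between N(m1, S1) and N(m2, S2).

    Closed form for Gaussians:
    W2^2 = ||m1 - m2||^2 + tr(S1 + S2 - 2 (S2^{1/2} S1 S2^{1/2})^{1/2}).
    """
    sqrt_S2 = np.real(sqrtm(S2))
    # Symmetrized cross term; np.real discards tiny imaginary round-off.
    cross = np.real(sqrtm(sqrt_S2 @ S1 @ sqrt_S2))
    return float(np.sum((m1 - m2) ** 2) + np.trace(S1 + S2 - 2.0 * cross))
```

For example, between N(0, 1) and N(3, 4) in one dimension the formula gives 3^2 + (1 + 4 - 2*2) = 10.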
Researcher Affiliation | Academia | 1Université d'Orléans, Université de Tours, CNRS, IDP, UMR 7013, Orléans, France 2Institut universitaire de France (IUF). Correspondence to: Emile Pierret <EMAIL>, Bruno Galerne <EMAIL>.
Pseudocode | No | The paper describes various numerical schemes (Euler-Maruyama, Exponential Integrator, DDPM, Euler, Heun, RK4) with their mathematical equations in Table 3 in Appendix E. However, these are mathematical formulations and not structured pseudocode or algorithm blocks.
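To make the first of these schemes concrete, here is a generic Euler-Maruyama discretization of the VP reverse-time SDE, dX = [-0.5 β(t) X - β(t) ∇ log p_t(X)] dt + sqrt(β(t)) dW̄, on a uniform grid from t = 1 down to t ≈ 0. This is a standard sketch, not the paper's Table 3 formulation; the function name and signature are ours:

```python
import numpy as np

def euler_maruyama_reverse(score, x_T, betas, rng):
    """Euler-Maruyama sampler for the VP reverse-time SDE.

    score : callable (x, t) -> score estimate at noise level t
    x_T   : samples from the prior at t = 1
    betas : per-step values of beta(t) on a uniform grid of [0, 1]
    """
    x = np.array(x_T, dtype=float)
    N = len(betas)
    dt = 1.0 / N
    for i in range(N - 1, -1, -1):
        t = (i + 1) / N
        b = betas[i]
        # Reverse-time drift: f(x, t) - g(t)^2 * score(x, t)
        drift = -0.5 * b * x - b * score(x, t)
        # Step backward in time (negative dt) plus diffusion noise.
        x = x - drift * dt + np.sqrt(b * dt) * rng.standard_normal(x.shape)
    return x
```

As a sanity check, for the standard Gaussian target the VP marginals are stationary and the exact score is simply `-x`, so the sampler should leave N(0, 1) approximately invariant.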
Open Source Code | No | An implementation of our code is available online. ... Our source code is available online and can be applied to any Gaussian data distribution of interest and gives insight to calibrate parameters of a diffusion sampling algorithm, e.g. by straightforwardly generalizing our study to higher order linear numerical schemes. ... We train the network using the code2 associated with the paper (Song et al., 2021b). 2Code available at https://github.com/yang-song/score_sde_pytorch. Although the paper states that 'Our source code is available online', no direct link or specific reference to supplementary material for their own code is provided within the document. The only specific link provided is for a third-party code used in their experiments.
Open Datasets | Yes | To illustrate our theoretical results, we consider the CIFAR-10 Gaussian distribution, that is, the Gaussian distribution such that Σ is the empirical covariance of the CIFAR-10 dataset.
Dataset Splits | No | The paper discusses using the CIFAR-10 dataset to derive a Gaussian distribution and describes generating samples from a Gaussian ADSN distribution for experiments. It mentions '25 samplings of 50K images' for empirical Wasserstein distance calculation. However, it does not specify any training, validation, or test splits for model training or evaluation in the traditional sense of dataset partitioning.
Hardware Specification | No | The paper does not provide specific details regarding the hardware (e.g., GPU models, CPU types, memory) used for running the experiments.
Software Dependencies | No | We train the network using the code2 associated with the paper (Song et al., 2021b). 2Code available at https://github.com/yang-song/score_sde_pytorch. The paper mentions using code associated with another paper which implicitly uses PyTorch, but it does not specify exact version numbers for any software components (e.g., Python, PyTorch, CUDA) used in their own work.
Experiment Setup | Yes | We choose the architecture of DDPM, which is a U-Net described in (Ho et al., 2020), with the parameters proposed for the dataset CelebA-HQ 256 to deal with the 256×256 ADSN model associated with the top-left image of Figure 3. We use the training procedure corresponding to DDPM cont. in (Song et al., 2021b). β is linear from 0.05 to 10 with T = 1. We train over 1.3M iterations, and we generate at each iteration a new batch of ADSN samples. We implement the stochastic EM and deterministic Heun schemes replacing the exact score by its learned version with N = 1000 steps and a truncation time ε = 10^-3.
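The quoted schedule (β linear from 0.05 to 10 with T = 1) determines the VP perturbation kernel p(x_t | x_0) = N(α(t) x_0, σ(t)² I) in closed form. A minimal sketch, assuming the standard VP-SDE parameterization of Song et al. (2021b); the function name is ours:

```python
import numpy as np

def vp_marginal_params(t, beta_min=0.05, beta_max=10.0):
    """Mean scaling alpha(t) and noise std sigma(t) of the VP kernel
    for a linear schedule beta(t) = beta_min + t*(beta_max - beta_min)
    on [0, 1], matching the setup quoted above (0.05 to 10, T = 1)."""
    # Integral of beta(s) ds from 0 to t for the linear schedule.
    integral = beta_min * t + 0.5 * (beta_max - beta_min) * t ** 2
    alpha = np.exp(-0.5 * integral)
    sigma = np.sqrt(1.0 - np.exp(-integral))
    return alpha, sigma
```

At t = 1 the signal scaling is almost zero and the noise std is close to 1 (the prior is nearly standard Gaussian), while at the truncation time ε = 10^-3 the sample is still essentially noise-free, which is why sampling stops there rather than at t = 0.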