Distributional Diffusion Models with Scoring Rules
Authors: Valentin De Bortoli, Alexandre Galashov, J Swaroop Guntupalli, Guangyao Zhou, Kevin Patrick Murphy, Arthur Gretton, Arnaud Doucet
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments in Section 6 demonstrate that our approach produces high-quality samples with significantly fewer denoising steps. We observe substantial benefits across a range of tasks, including image-generation tasks in both pixel and latent spaces, and in robotics applications. |
| Researcher Affiliation | Collaboration | 1Google DeepMind, 2Gatsby Unit, UCL. Correspondence to: Valentin De Bortoli <EMAIL>, Alexandre Galashov <EMAIL>. |
| Pseudocode | Yes | Algorithm 1 Distributional Diffusion Model (training). Require: M training steps, schedule (αt, σt), distribution p0, weights θ0, batch size n, population size m. for k = 1 : M do: Sample t_i ∼ Unif([0, 1]) for i ∈ [n]; Sample X_0^i i.i.d. ∼ p0 for i ∈ [n]; Sample X_{t_i}^i ∼ p_{t_i|0}(·|X_0^i) for i ∈ [n] using (2); Sample ξ^{i,j} ∼ N(0, I_d) for i ∈ [n], j ∈ [m]; Set θ_k = θ_{k−1} − δ ∇L_{n,m}(θ)|_{θ_{k−1}} using (14); end for. Return: θ_M |
| Open Source Code | No | The paper does not provide explicit statements about releasing source code for the methodology, nor does it include links to a code repository. Appendix G contains pseudocode, but this is not equivalent to open-source code. |
| Open Datasets | Yes | We train conditional pixel-space models on CIFAR-10 (32x32x3) and on CelebA (64x64x3), as well as unconditional pixel-space models on LSUN Bedrooms (64x64x3). We further use an autoencoder trained on CelebA-HQ (256x256x3)... We experiment with diffusion policies on the Libero (Liu et al., 2024) benchmark... We ran initial experiments on ImageNet (Russakovsky et al., 2015) with resolution 64x64x3 |
| Dataset Splits | No | The paper mentions that 4096 samples were used for evaluation in the 2D experiments, and that final FID was computed on 50000 samples for CIFAR-10, LSUN Bedrooms, and CelebA, and on 30000 samples for latent CelebA-HQ. For robotics, Libero10 had 138090 steps of training data. However, it does not explicitly provide percentages or specific quantities for training, validation, and test splits for any of these datasets, nor does it cite a methodology for such splits. |
| Hardware Specification | Yes | As hardware, we use an A100 GPU (40 GB of memory) with batch size 16, an H100 GPU (80 GB of memory) with batch size 64, and a TPUv5p (95 GB of memory) with batch size 64 (per device, with 4 devices in total). |
| Software Dependencies | No | The paper mentions using 'Adam optimizer', 'cosine warmup', 'JAX', 'BERT encoder', and 'ResNet' but does not provide specific version numbers for any of these software components or libraries. |
| Experiment Setup | Yes | We train diffusion models and distributional diffusion models for 100k steps with batch size 128 and learning rate 1e-3 using the Adam optimizer with a cosine warmup for the first 100 iterations. We use b1 = 0.9, b2 = 0.999, ϵ = 1e-8 in the Adam optimizer. On top of that, we clip the updates by their global norm (with the maximal norm being 1). We use an EMA decay of 0.99. We use the flow matching noise schedule (3) and a safety epsilon of 1e-2... For distributional models, we additionally sweep over λ ∈ {0, 0.1, 0.5, 1.0}, β ∈ {0.0001, 0.001, 0.01, 0.1, 0.5, 1.0, 1.5, 1.99, 2.0}. We use population size m = 4. |
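The (λ, β) sweep in the setup row points at a generalized energy-score objective over the m = 4 model samples, which is the scoring-rule loss the paper's title refers to. A minimal NumPy sketch under that assumption (the exact loss form, function name, and parameterization are inferred from the sweep, not taken from the paper's code):

```python
import numpy as np

def energy_score_loss(x0, samples, beta=1.0, lam=1.0):
    """Empirical generalized energy score for one data point (sketch).

    x0:      (d,) clean target.
    samples: (m, d) model samples, one per latent noise draw xi_j.
    The first term pulls samples toward x0; the lam-weighted pairwise
    term pushes samples apart, rewarding distributional spread.
    """
    m = samples.shape[0]
    # (1/m) * sum_j ||x0 - G_j||^beta  -- attraction to the target
    attract = np.mean(np.linalg.norm(samples - x0, axis=-1) ** beta)
    # (1/(2 m (m-1))) * sum_{j != k} ||G_j - G_k||^beta  -- repulsion
    diffs = np.linalg.norm(samples[:, None, :] - samples[None, :, :], axis=-1)
    repel = (diffs ** beta).sum() / (2 * m * (m - 1))
    return attract - lam * repel
```

With λ = 0 the repulsion term vanishes and the loss reduces to plain β-norm regression toward x0; λ > 0 rewards diversity among the m samples drawn in a single denoising pass.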
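The optimizer details in the setup row (Adam with a 100-step cosine warmup, update clipping by global norm with maximum 1, EMA decay 0.99) can be sketched framework-agnostically. The half-cosine ramp shape is an assumption, since the paper only says "cosine warmup":

```python
import math

def lr_schedule(step, base_lr=1e-3, warmup=100):
    # Half-cosine ramp from 0 to base_lr over the first `warmup` steps
    # (assumed shape), then constant base_lr thereafter.
    if step < warmup:
        return base_lr * 0.5 * (1.0 - math.cos(math.pi * step / warmup))
    return base_lr

def clip_by_global_norm(grads, max_norm=1.0):
    # Rescale the whole gradient list so its joint L2 norm is <= max_norm.
    total = math.sqrt(sum(g * g for grad in grads for g in grad))
    scale = min(1.0, max_norm / max(total, 1e-12))
    return [[g * scale for g in grad] for grad in grads]

def ema_update(ema_params, params, decay=0.99):
    # Exponential moving average of parameters, typically used at eval time.
    return [[decay * e + (1.0 - decay) * p for e, p in zip(er, pr)]
            for er, pr in zip(ema_params, params)]
```

In the paper's JAX setting these three pieces would normally come from an optimizer library rather than be hand-rolled; the sketch only makes the stated hyperparameters concrete.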