Deep MMD Gradient Flow without adversarial training

Authors: Alexandre Galashov, Valentin De Bortoli, Arthur Gretton

ICLR 2025

Reproducibility assessment (variable, result, and supporting LLM response):
Research Type: Experimental
Response: "We obtain competitive empirical performance in unconditional image generation on CIFAR10, MNIST, CELEB-A (64x64) and LSUN Church (64x64)." ... "Section 7, we show that our method, Diffusion-MMD-gradient flow (DMMD), yields competitive performance in generative modeling on 2-D datasets as well as in unconditional image generation on CIFAR10 (Krizhevsky & Hinton, 2009), MNIST, CELEB-A, LSUN Church."
Researcher Affiliation: Collaboration
Response: Alexandre Galashov (UCL Gatsby, Google DeepMind), Valentin De Bortoli (Google DeepMind), Arthur Gretton (UCL Gatsby, Google DeepMind).
Pseudocode: Yes
Response: "Algorithm 1: Train noise-conditional MMD discriminator. Algorithm 2: Noise-adaptive MMD gradient flow. Algorithm 3: Approximate noise-adaptive MMD gradient flow for a single particle. Algorithm 4: Noise-adaptive KALE flow for a single particle."
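To make the "MMD gradient flow" idea referenced by these algorithms concrete, here is a minimal sketch of a plain (non-noise-adaptive) MMD gradient flow with a Gaussian kernel: particles are updated by explicit Euler steps of gradient descent on the squared MMD to a fixed target sample. This is an illustrative reduction, not the paper's noise-conditional DMMD procedure; the function names and the kernel choice are assumptions.

```python
import numpy as np

def gaussian_grad_sum(x, y, sigma):
    """Sum over j of the gradient w.r.t. x_i of the Gaussian kernel
    k(x_i, y_j) = exp(-||x_i - y_j||^2 / (2 sigma^2))."""
    diff = x[:, None, :] - y[None, :, :]                      # (N, M, d)
    sq = np.sum(diff ** 2, axis=-1)                           # (N, M)
    k = np.exp(-sq / (2.0 * sigma ** 2))                      # (N, M)
    return -(diff * k[..., None]).sum(axis=1) / sigma ** 2    # (N, d)

def mmd_flow_step(particles, target, sigma=1.0, eta=0.1):
    """One explicit Euler step of gradient descent on MMD^2(particles, target).

    The particle-particle term acts as a repulsion; the particle-target
    term attracts particles toward the target sample.
    """
    n, m = len(particles), len(target)
    grad = (2.0 / n ** 2) * gaussian_grad_sum(particles, particles, sigma) \
         - (2.0 / (n * m)) * gaussian_grad_sum(particles, target, sigma)
    return particles - eta * grad
```

In the paper's noise-adaptive variants, this fixed kernel would be replaced by a learned noise-conditional discriminator feature map, and the flow would be run across a schedule of noise levels.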
Open Source Code: No
Response: "No explicit statement about code release or repository link for the methodology described in this paper was found."
Open Datasets: Yes
Response: "unconditional image generation on CIFAR10 (Krizhevsky & Hinton, 2009), MNIST, CELEB-A (64x64) and LSUN Church (64x64)." ... "MNIST (Lecun et al., 1998), CELEB-A (64x64) (Liu et al., 2015) and LSUN-Church (64x64) (Yu et al., 2015)."
Dataset Splits: Yes
Response: "For MNIST and CELEB-A, we use the same training/test split as well as the evaluation protocol as in (Franceschi et al., 2023). For LSUN Church, we compute FID on 50000 samples similar to DDPM (Ho et al., 2020)." ... "To select hyperparameters and track performance during training, we use FID evaluated on a subset of 1024 images from a training set of CIFAR10."
Hardware Specification: Yes
Response: "We used 1 A100 GPU with 40GB of memory to run these experiments." ... "For all the experiments, we used A100 GPUs with 40 GB of memory."
Software Dependencies: No
Response: "No specific software dependencies with version numbers were found in the paper."
Experiment Setup: Yes
Response: "DMMD is trained for Niter = 250000 iterations with a batch size B = 64 with number Nnoise = 16 of noise levels per batch. We use a gradient penalty λ = 1.0 and ℓ2 regularisation strength λℓ2 = 0.1." ... "We use η = 0.1 gradient flow learning rate, T = 10 number of noise levels, Np = 200 number of noisy particles, Ns = 5 number of gradient flow steps per noise level, tmin = 0.001 and tmax = 1 - 0.001. We use a batch of 400 clean particles during training."
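The reported setup can be collected into a single configuration sketch, which is useful for a reproduction attempt. The dictionary keys below are hypothetical names of our choosing; only the values come from the quoted experiment setup.

```python
# Hyperparameters as reported in the paper's experiment setup.
# Key names are illustrative, not taken from any released codebase.
DMMD_TRAIN = {
    "n_iterations": 250_000,        # Niter: discriminator training iterations
    "batch_size": 64,               # B: clean-data batch size
    "noise_levels_per_batch": 16,   # Nnoise
    "gradient_penalty": 1.0,        # lambda
    "l2_regularisation": 0.1,       # lambda_l2
    "clean_particles_batch": 400,   # clean particles used during training
}

DMMD_SAMPLING = {
    "learning_rate": 0.1,           # eta: gradient flow step size
    "noise_levels": 10,             # T
    "n_particles": 200,             # Np: noisy particles
    "steps_per_level": 5,           # Ns: gradient flow steps per noise level
    "t_min": 0.001,
    "t_max": 1.0 - 0.001,
}
```

A reproduction would sweep around these values; the paper also reports the evaluation protocol (FID on held-out samples) used to select them.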