(De)-regularized Maximum Mean Discrepancy Gradient Flow

Authors: Zonghao Chen, Aratrika Mustafi, Pierre Glaser, Anna Korba, Arthur Gretton, Bharath K. Sriperumbudur

JMLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section, we demonstrate the superior empirical performance of the proposed DrMMD descent in various experimental settings. 8.1 Three-ring experiment... From Figure 2 (Left and Middle), we can see that DrMMD descent outperforms MMD, KALE, and χ² descent on both dissimilarity metrics to the target π: MMD and the Wasserstein-2 distance. Figure 1 is an animation visualizing the evolution of particles under these descent schemes...
Researcher Affiliation | Academia | Zonghao Chen (EMAIL), Department of Computer Science, University College London; Aratrika Mustafi (EMAIL), Department of Statistics, Pennsylvania State University; Pierre Glaser (EMAIL), Gatsby Computational Neuroscience Unit, University College London; Anna Korba (EMAIL), ENSAE, CREST, Institut Polytechnique de Paris; Arthur Gretton (EMAIL), Gatsby Computational Neuroscience Unit, University College London; Bharath K. Sriperumbudur (EMAIL), Department of Statistics, Pennsylvania State University.
Pseudocode | Yes | Algorithm 1: DrMMD particle descent.
Open Source Code | Yes | The code to reproduce all the experiments can be found in the following GitHub repository: https://github.com/hudsonchen/DrMMD.
Open Datasets | No | The paper uses synthetic datasets generated for the experiments (e.g., "the target distribution π is defined on a manifold in R^2 consisting of three non-overlapping rings; the initial source distribution µ0 is a Gaussian distribution" and "the data distribution Pdata is a uniform distribution on the sphere in R^p with p = 50"). No access information or citations for existing public datasets are provided.
Dataset Splits | Yes | The data distribution Pdata is a uniform distribution on the sphere in R^p with p = 50. 2000 samples are drawn from Pdata, with 1000 used as the training set and the other 1000 as the validation set.
Hardware Specification | No | The paper does not describe the specific hardware (e.g., GPU or CPU models, or cloud computing instance details) used to run the experiments.
Software Dependencies | No | The paper mentions "automatic differentiation libraries such as JAX (Bradbury et al., 2018)" but does not provide version numbers for JAX or any other key software components used in the implementation.
Experiment Setup | Yes | We sample N = M = 300 samples from the initial source and target distributions and run DrMMD descent with adaptive λ for nmax = 100,000 iterations... We use a Gaussian kernel k(x, x') = exp(-||x - x'||^2 / (2 l^2)) with bandwidth l = 0.3. The step size for MMD descent is γ = 10^-2, and the step size for KALE and DrMMD descent is γ = 10^-3. We enforce a positive lower bound λ = 10^-3 for numerical stability, and the regularity hyperparameter r is optimized over the set {0.1, 0.5, 1.0}.
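The sphere dataset and its split described above can be sketched as follows. This is a minimal reconstruction from the quoted description (uniform on the unit sphere in R^p, p = 50, 2000 samples split 1000/1000); the random seed and the ordering of the split are assumptions, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)  # seed is an arbitrary choice
p, n = 50, 2000

# Uniform samples on the unit sphere in R^p: draw standard
# Gaussians and normalize each row to unit length.
Z = rng.standard_normal((n, p))
data = Z / np.linalg.norm(Z, axis=1, keepdims=True)

# 1000 training samples and 1000 validation samples.
train, val = data[:1000], data[1000:]
```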
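To illustrate the kernel and step sizes in the setup above, here is a minimal NumPy sketch of plain MMD particle descent with the Gaussian kernel k(x, x') = exp(-||x - x'||^2 / (2 l^2)), l = 0.3. This reproduces only the baseline MMD flow, not the paper's (de)-regularized DrMMD witness; the toy source and target distributions and iteration count are assumptions for illustration:

```python
import numpy as np

def mmd_descent_step(X, Y, gamma=1e-2, l=0.3):
    """One MMD descent step: move source particles X against the
    gradient of the MMD witness function, using the Gaussian kernel."""
    dxx = X[:, None, :] - X[None, :, :]                 # (N, N, d)
    dxy = X[:, None, :] - Y[None, :, :]                 # (N, M, d)
    kxx = np.exp(-np.sum(dxx ** 2, -1) / (2 * l ** 2))  # (N, N)
    kxy = np.exp(-np.sum(dxy ** 2, -1) / (2 * l ** 2))  # (N, M)
    # grad_x k(x, z) = -(x - z) / l^2 * k(x, z); the witness gradient
    # repels particles from each other and attracts them to the target.
    v = (-(dxx * kxx[..., None]).mean(1)
         + (dxy * kxy[..., None]).mean(1)) / l ** 2
    return X - gamma * v

def mmd2(X, Y, l=0.3):
    """Squared MMD between empirical measures of X and Y."""
    def gram(A, B):
        d2 = np.sum((A[:, None, :] - B[None, :, :]) ** 2, -1)
        return np.exp(-d2 / (2 * l ** 2))
    return gram(X, X).mean() - 2 * gram(X, Y).mean() + gram(Y, Y).mean()

rng = np.random.default_rng(0)
X = rng.normal(0.5, 0.1, (300, 2))  # toy source particles (assumed)
Y = rng.normal(0.0, 0.1, (300, 2))  # toy target particles (assumed)
mmd2_init = mmd2(X, Y)
for _ in range(200):
    X = mmd_descent_step(X, Y)
mmd2_final = mmd2(X, Y)
```

The velocity field here is the gradient of the standard MMD witness; DrMMD replaces this witness with a (de)-regularized variant controlled by λ, which the paper adapts over iterations.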