(De)-regularized Maximum Mean Discrepancy Gradient Flow
Authors: Zonghao Chen, Aratrika Mustafi, Pierre Glaser, Anna Korba, Arthur Gretton, Bharath K. Sriperumbudur
JMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we demonstrate the superior empirical performance of the proposed DrMMD descent in various experimental settings. 8.1 Three-ring experiment... From Figure 2 Left and Middle, we can see that DrMMD descent outperforms MMD, KALE, and χ² descent in terms of all dissimilarity metrics with respect to the target π: MMD and Wasserstein-2 distance. Figure 1 is an animation visualizing the evolution of particles under these descent schemes... |
| Researcher Affiliation | Academia | Zonghao Chen, Department of Computer Science, University College London... Aratrika Mustafi, Department of Statistics, Pennsylvania State University... Pierre Glaser, Gatsby Computational Neuroscience Unit, University College London... Anna Korba, ENSAE, CREST, Institut Polytechnique de Paris... Arthur Gretton, Gatsby Computational Neuroscience Unit, University College London... Bharath K. Sriperumbudur, Department of Statistics, Pennsylvania State University. |
| Pseudocode | Yes | Algorithm 1: DrMMD particle descent |
| Open Source Code | Yes | The code to reproduce all the experiments can be found in the following GitHub repository: https://github.com/hudsonchen/DrMMD. |
| Open Datasets | No | The paper uses synthetic datasets generated for the experiments (e.g., "the target distribution π is defined on a manifold in R^2 consisting of three non-overlapping rings. The initial source distribution µ0 is a Gaussian distribution" and "The data distribution P_data is a uniform distribution on the sphere in R^p with p = 50"). No concrete access information or citations for existing public datasets are provided. |
| Dataset Splits | Yes | The data distribution P_data is a uniform distribution on the sphere in R^p with p = 50. 2000 data points are sampled from P_data, with 1000 used as the training set and the other 1000 as the validation set. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware (e.g., GPU models, CPU models, or cloud computing instance details) used for running the experiments. |
| Software Dependencies | No | The paper mentions 'automatic differentiation libraries such as JAX (Bradbury et al., 2018)' but does not provide specific version numbers for JAX or any other key software components used in their implementation. |
| Experiment Setup | Yes | We sample N = M = 300 samples from the initial source and target distributions and run DrMMD descent with adaptive λ for n_max = 100,000 iterations... we use a Gaussian kernel k(x, x′) = exp(−0.5‖x − x′‖²/l²) with bandwidth l = 0.3. The step size for MMD descent is γ = 10⁻² and the step size for KALE and DrMMD descent is γ = 10⁻³. We enforce a positive lower bound λ = 10⁻³ for numerical stability, and the regularity hyperparameter r is optimized over the set {0.1, 0.5, 1.0}. |
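For context on the setup quoted above, plain MMD particle descent (one of the baselines the paper compares against) can be sketched in a few lines of NumPy: each particle follows the negative gradient of the empirical MMD witness function under the Gaussian kernel. This is an illustrative sketch only, not the authors' DrMMD implementation (DrMMD additionally solves a (de)-regularized witness problem governed by λ); the kernel form and bandwidth mirror the reported setup, while the sampling and step-size choices here are arbitrary.

```python
import numpy as np

def gaussian_kernel_grad(X, Z, l=0.3):
    """Gradient wrt x of k(x, z) = exp(-0.5 * ||x - z||^2 / l^2),
    evaluated for every pair (x_i, z_j). Returns an (N, M, d) array."""
    diff = X[:, None, :] - Z[None, :, :]            # (N, M, d) pairwise x_i - z_j
    sq = np.sum(diff**2, axis=-1, keepdims=True)    # (N, M, 1) squared distances
    return -np.exp(-0.5 * sq / l**2) * diff / l**2  # points from x_i toward z_j

def mmd_descent(X, Y, gamma=1e-2, n_steps=1000, l=0.3):
    """Plain MMD particle descent: particles X flow toward target samples Y
    along the negative gradient of the empirical MMD witness function."""
    X = X.copy()
    for _ in range(n_steps):
        # witness gradient at each particle: self-interaction term minus
        # attraction toward the target samples
        grad_witness = (gaussian_kernel_grad(X, X, l).mean(axis=1)
                        - gaussian_kernel_grad(X, Y, l).mean(axis=1))
        X -= gamma * grad_witness
    return X
```

Note the two terms in the update: the X-on-X term makes particles repel each other (preventing collapse), while the X-on-Y term attracts them to the target sample cloud, which is the qualitative behavior visualized in the paper's Figure 1.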