Composite Goodness-of-fit Tests with Kernels

Authors: Oscar Key, Arthur Gretton, François-Xavier Briol, Tamara Fernandez

JMLR 2025

Reproducibility Variable — Result — LLM Response
Research Type: Experimental
LLM Response: "In the following section, we now study the performance of our tests in a range of settings. We consider three examples: (i) a multivariate Gaussian location and scale model, (ii) a generative model of the interaction of genes through time called a toggle-switch model, and (iii) an unnormalised nonparametric density model. For all experiments we set the test level α = 0.05, and give full implementation details in Section D."
Researcher Affiliation: Academia
LLM Response: "Oscar Key EMAIL Centre for Artificial Intelligence, University College London; Arthur Gretton EMAIL Gatsby Computational Neuroscience Unit, University College London; François-Xavier Briol EMAIL Department of Statistical Science, University College London; Tamara Fernandez EMAIL Faculty of Engineering and Science, Adolfo Ibáñez University"
Pseudocode: Yes
LLM Response: "Algorithm 1: Wild bootstrap (MMD) ... Algorithm 2: Wild bootstrap (KSD) ... Algorithm 3: Parametric bootstrap"
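To give a sense of what a wild-bootstrap kernel test like the paper's Algorithm 1 involves, here is a minimal NumPy sketch (not the authors' JAX implementation). It assumes equal sample sizes, a fixed Gaussian kernel, and ignores the parameter-estimation step that makes the composite setting interesting; the function and parameter names are illustrative only.

```python
import numpy as np

def gaussian_kernel(a, b, lengthscale=1.0):
    """Gaussian (RBF) kernel matrix between two sample arrays of shape (n, d)."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * lengthscale ** 2))

def mmd_wild_bootstrap_test(x, y, n_bootstrap=500, alpha=0.05, seed=0):
    """Sketch of a wild-bootstrap MMD two-sample test.

    x: observed data, y: samples from the (fitted) model; both (n, d).
    Returns the U-statistic estimate of MMD^2 and a reject/accept decision.
    """
    rng = np.random.default_rng(seed)
    n = x.shape[0]
    # Combine the kernel terms into the usual h-matrix for the MMD U-statistic.
    h = (gaussian_kernel(x, x) + gaussian_kernel(y, y)
         - gaussian_kernel(x, y) - gaussian_kernel(y, x))
    np.fill_diagonal(h, 0.0)  # U-statistic: exclude diagonal terms
    stat = h.sum() / (n * (n - 1))
    # Wild bootstrap: reweight h by Rademacher signs w_i * w_j to
    # approximate the null distribution of the statistic.
    boots = np.empty(n_bootstrap)
    for b in range(n_bootstrap):
        w = rng.choice([-1.0, 1.0], size=n)
        boots[b] = (w @ h @ w) / (n * (n - 1))
    threshold = np.quantile(boots, 1.0 - alpha)
    return stat, bool(stat > threshold)
```

With well-separated samples the statistic far exceeds the bootstrap quantile and the test rejects; with two samples from the same distribution it rejects at roughly the nominal level α.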
Open Source Code: Yes
LLM Response: "The code to reproduce our experiments, and implement the tests for your own models, is available at: https://github.com/oscarkey/composite-tests"
Open Datasets: Yes
LLM Response: "Matsubara et al. (2022) use this model for a density estimation task on a data set of velocities of 82 galaxies (Postman et al., 1986; Roeder, 1990), and use the approximation with p = 25 for computational convenience."
Dataset Splits: No
LLM Response: The paper describes generating data for experiments and using existing datasets, but does not explicitly provide the training/validation/test splits needed to reproduce the experiments. For example, in the Gaussian model and toggle-switch model sections, data is generated without explicit splits, and the density estimation section uses a single dataset without specifying splits for reproduction.
Hardware Specification: Yes
LLM Response: "The experiments are implemented in JAX, and executed on an Nvidia RTX 3090 GPU."
Software Dependencies: No
LLM Response: "The experiments are implemented in JAX, and executed on an Nvidia RTX 3090 GPU." While JAX is mentioned as a software component, no specific version number is provided for JAX or any other libraries used.
Experiment Setup: Yes
LLM Response: "For the experiments using the MMD, we minimize the minimum distance estimator using the Adam optimiser (Kingma and Ba, 2015). We use a learning rate of 0.05 and 100 iterations. When θ = µ_d (Figures 4 to 6) we sample the initial value of µ_d from N(0_d, I_d). When θ = {µ, σ²} (Figures 3 and 7) we sample the initial µ from N(0, 1), and sample σ from a standard normal distribution truncated between 0.5 and 1.5. ... We use stochastic gradient descent with random restarts to estimate the parameters. The algorithm is: ... 3. Run SGD 15 times, once for each of the selected θ_init^i, to find θ̂_n^1, ..., θ̂_n^15. Use learning rate 0.04 and perform 300 iterations."
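The SGD-with-random-restarts scheme quoted above can be sketched as follows. This is a minimal NumPy version using plain gradient descent in place of the paper's Adam/SGD setup, with the restart count, learning rate, and iteration budget taken from the quote; `loss_grad` and `sample_init` are hypothetical stand-ins for the kernel objective and the initial-value distributions described in the paper.

```python
import numpy as np

def sgd_random_restarts(loss_grad, sample_init, n_restarts=15,
                        lr=0.04, n_iters=300, seed=0):
    """Gradient descent from several random initialisations, keeping the best.

    loss_grad: callable theta -> (loss, gradient)  [hypothetical interface]
    sample_init: callable rng -> initial theta     [hypothetical interface]
    """
    rng = np.random.default_rng(seed)
    best_theta, best_loss = None, np.inf
    for _ in range(n_restarts):
        theta = sample_init(rng)
        for _ in range(n_iters):
            _, grad = loss_grad(theta)
            theta = theta - lr * grad  # plain gradient step
        final_loss, _ = loss_grad(theta)
        if final_loss < best_loss:  # keep the restart with the lowest loss
            best_theta, best_loss = theta, final_loss
    return best_theta, best_loss
```

For example, with the toy objective (θ − 2)² and initial values drawn from N(0, 1), every restart converges close to θ = 2 under these settings; the restarts matter when the objective is multimodal, as kernel distance objectives for generative models often are.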