Composite Goodness-of-fit Tests with Kernels
Authors: Oscar Key, Arthur Gretton, François-Xavier Briol, Tamara Fernandez
JMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In the following section, we now study the performance of our tests in a range of settings. We consider three examples: (i) a multivariate Gaussian location and scale model, (ii) a generative model of the interaction of genes through time called a toggle-switch model, and (iii) an unnormalised nonparametric density model. For all experiments we set the test level α = 0.05, and give full implementation details in Section D. |
| Researcher Affiliation | Academia | Oscar Key EMAIL Centre for Artificial Intelligence, University College London Arthur Gretton EMAIL Gatsby Computational Neuroscience Unit, University College London François-Xavier Briol EMAIL Department of Statistical Science, University College London Tamara Fernandez EMAIL Faculty of Engineering and Science, Adolfo Ibáñez University |
| Pseudocode | Yes | Algorithm 1: Wild bootstrap (MMD) ... Algorithm 2: Wild bootstrap (KSD) ... Algorithm 3: Parametric bootstrap |
| Open Source Code | Yes | The code to reproduce our experiments, and implement the tests for your own models, is available at: https://github.com/oscarkey/composite-tests |
| Open Datasets | Yes | Matsubara et al. (2022) use this model for a density estimation task on a data set of velocities of 82 galaxies (Postman et al., 1986; Roeder, 1990), and use the approximation with p = 25 for computational convenience. |
| Dataset Splits | No | The paper describes generating data for experiments and using existing datasets, but does not explicitly provide training/test/validation splits needed to reproduce the experiment. For example, in the Gaussian Model and Toggle-Switch Model sections, data is generated without explicit splits, and the Density Estimation section uses a single dataset without specifying reproduction splits. |
| Hardware Specification | Yes | The experiments are implemented in JAX, and executed on an Nvidia RTX 3090 GPU. |
| Software Dependencies | No | The experiments are implemented in JAX, and executed on an Nvidia RTX 3090 GPU. While JAX is mentioned as a software component, no specific version number is provided for JAX or any other libraries used. |
| Experiment Setup | Yes | For the experiments using the MMD, we compute the minimum distance estimator using the Adam optimiser (Kingma and Ba, 2015). We use a learning rate of 0.05 and 100 iterations. When θ = µd (Figures 4 to 6) we sample the initial value of µd from N(0d, Id). When θ = {µ, σ²} (Figures 3 and 7) we sample the initial µ from N(0, 1), and sample σ from a standard normal distribution truncated between 0.5 and 1.5. ... We use stochastic gradient descent with random restarts to estimate the parameters. The algorithm is: ... 3. Run SGD 15 times, once for each of the selected θi init, to find ˆθ1 n, . . . , ˆθ15 n. Use learning rate 0.04 and perform 300 iterations. |
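The wild bootstrap listed under the Pseudocode row (Algorithms 1 and 2) can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: it assumes the test statistic is the mean of an (n, n) kernel or Stein matrix `H` already evaluated on the sample, and resamples the degenerate V-statistic with Rademacher multipliers. The function name `wild_bootstrap_pvalue` and the `n_boot` default are hypothetical.

```python
import numpy as np

def wild_bootstrap_pvalue(H, n_boot=1000, rng=None):
    """Wild-bootstrap p-value for a degenerate V-statistic T = mean(H).

    H: (n, n) symmetric kernel/Stein matrix evaluated on the sample.
    Each bootstrap replicate reweights H with i.i.d. Rademacher signs,
    mimicking the statistic's null distribution.
    """
    rng = np.random.default_rng(rng)
    n = H.shape[0]
    t_obs = H.mean()
    # Rademacher multipliers eps_i in {-1, +1}, one row per replicate
    eps = rng.choice([-1.0, 1.0], size=(n_boot, n))
    # Bootstrap statistic: (1/n^2) * eps' H eps for each replicate
    t_boot = np.einsum("bi,ij,bj->b", eps, H, eps) / n**2
    # Add-one p-value to keep it in (0, 1]
    return (1 + np.sum(t_boot >= t_obs)) / (1 + n_boot)
```

The test then rejects when this p-value falls below the level α = 0.05 used throughout the paper's experiments.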
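The random-restart estimation procedure quoted in the Experiment Setup row (15 SGD runs, learning rate 0.04, 300 iterations, keep the best) can be sketched generically. The names `sgd_with_restarts` and `loss_grad` are assumptions, and a toy quadratic stands in for the paper's MMD/KSD objective; the paper's actual runs use JAX and Adam where noted.

```python
import numpy as np

def sgd_with_restarts(loss_grad, init_sampler, n_restarts=15,
                      lr=0.04, n_iters=300, rng=None):
    """Minimise a loss from several random initialisations.

    loss_grad(theta) -> (loss, gradient); init_sampler(rng) -> initial theta.
    Returns the parameter with the lowest final loss across restarts.
    """
    rng = np.random.default_rng(rng)
    best_theta, best_loss = None, np.inf
    for _ in range(n_restarts):
        theta = init_sampler(rng)
        for _ in range(n_iters):
            _, g = loss_grad(theta)
            theta = theta - lr * g  # plain gradient step
        final_loss, _ = loss_grad(theta)
        if final_loss < best_loss:
            best_theta, best_loss = theta, final_loss
    return best_theta, best_loss
```

For example, with the toy objective (θ − 2)², `sgd_with_restarts(lambda t: ((t - 2.0)**2, 2.0*(t - 2.0)), lambda rng: rng.normal())` recovers θ ≈ 2 regardless of the random initialisation.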