MMD Aggregated Two-Sample Test

Authors: Antonin Schrab, Ilmun Kim, Mélisande Albert, Béatrice Laurent, Benjamin Guedj, Arthur Gretton

JMLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate that MMDAgg significantly outperforms alternative state-of-the-art MMD-based two-sample tests on synthetic data satisfying the Sobolev smoothness assumption, and that, on real-world image data, MMDAgg closely matches the power of tests leveraging the use of models such as neural networks.
Researcher Affiliation | Academia | Antonin Schrab (Centre for Artificial Intelligence, University College London & Inria London; Gatsby Computational Neuroscience Unit, University College London, London, WC1V 6LJ, UK); Ilmun Kim (Department of Statistics & Data Science, Department of Applied Statistics, Yonsei University, Seoul, 03722, South Korea); Mélisande Albert (Institut de Mathématiques de Toulouse, UMR 5219, Université de Toulouse, CNRS, INSA, France); Béatrice Laurent (Institut de Mathématiques de Toulouse, UMR 5219, Université de Toulouse, CNRS, INSA, France); Benjamin Guedj (Centre for Artificial Intelligence, University College London & Inria London, London, WC1V 6LJ, UK); Arthur Gretton (Gatsby Computational Neuroscience Unit, University College London, London, W1T 4JG, UK)
Pseudocode | Yes | Algorithm 1: MMDAgg with kernel collection Λ, weights w, simulation parameters B1:3, and level α
Open Source Code | Yes | We provide a user-friendly parameter-free implementation of MMDAgg, both in Jax and in Numpy, available at https://github.com/antoninschrab/mmdagg-paper. This repository also contains code for the reproducibility of our experiments.
Open Datasets | Yes | We consider the MNIST dataset (LeCun et al., 2010) down-sampled to 7 × 7 images.
Dataset Splits | Yes | For the split and oracle tests, we use two equal halves of the data, and oracle is run on twice the sample sizes.
Hardware Specification | No | No specific hardware details (GPU/CPU models, processor types, or memory amounts) are provided for running the experiments.
Software Dependencies | No | We provide a user-friendly parameter-free implementation of MMDAgg, both in Jax and in Numpy, available at https://github.com/antoninschrab/mmdagg-paper. However, specific version numbers for these libraries are not provided.
Experiment Setup | Yes | We use level α = 0.05 for all our experiments. We use B1 = 2000 and B2 = 2000 simulated test statistics to estimate the quantiles and the probability in Equation (13) for the level correction, respectively, and use B3 = 50 steps of bisection method. For the median, split and oracle tests, we use B = 500 simulated test statistics to estimate the quantile.
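The pseudocode row above refers to Algorithm 1, which aggregates MMD tests over a collection of kernels. As background, here is a minimal NumPy sketch of the quadratic-time biased MMD² estimate such tests are built on; the function names and the choice of a Gaussian kernel are illustrative, not the paper's exact implementation:

```python
import numpy as np

def gaussian_kernel(x, y, bandwidth):
    # Gaussian kernel matrix between samples x of shape (m, d) and y of shape (n, d).
    sq_dists = ((x[:, None, :] - y[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq_dists / (2 * bandwidth ** 2))

def mmd2_biased(x, y, bandwidth):
    # Biased quadratic-time estimate of the squared MMD between samples x and y.
    kxx = gaussian_kernel(x, x, bandwidth)
    kyy = gaussian_kernel(y, y, bandwidth)
    kxy = gaussian_kernel(x, y, bandwidth)
    return kxx.mean() + kyy.mean() - 2 * kxy.mean()

rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, size=(200, 2))  # sample from P
y = rng.normal(1.0, 1.0, size=(200, 2))  # sample from a shifted Q
print(mmd2_biased(x, y, bandwidth=1.0))  # noticeably positive for separated samples
```

MMDAgg's contribution is aggregating such statistics over several bandwidths instead of committing to one; the single-bandwidth estimate above is only the building block.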
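The MNIST experiment uses images down-sampled to 7 × 7. One common way to achieve this, shown here as an illustrative sketch (the paper's exact down-sampling procedure may differ), is to average non-overlapping 4 × 4 blocks of the 28 × 28 image:

```python
import numpy as np

def downsample(img, factor=4):
    # Average non-overlapping factor x factor blocks: 28x28 -> 7x7 for factor=4.
    h, w = img.shape
    return img.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

image = np.arange(28 * 28, dtype=float).reshape(28, 28)  # stand-in for an MNIST digit
small = downsample(image)
print(small.shape)  # (7, 7)
```

Block averaging preserves the overall intensity (the mean pixel value is unchanged) while reducing the dimension from 784 to 49.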
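The experiment setup row estimates test thresholds from simulated test statistics (e.g. B = 500 for the median, split and oracle tests). A generic sketch of that idea, estimating the (1 − α) quantile of a statistic under the null by recomputing it on permutations of the pooled sample; the statistic and helper names here are illustrative, not the paper's code:

```python
import numpy as np

def permutation_quantile(x, y, statistic, alpha=0.05, B=500, seed=0):
    # Estimate the (1 - alpha) quantile of the statistic under the null
    # hypothesis by recomputing it on B random splits of the pooled sample.
    rng = np.random.default_rng(seed)
    pooled = np.concatenate([x, y])
    m = len(x)
    stats = np.empty(B)
    for b in range(B):
        perm = rng.permutation(len(pooled))
        stats[b] = statistic(pooled[perm[:m]], pooled[perm[m:]])
    return np.quantile(stats, 1 - alpha)

# Toy statistic standing in for MMD: absolute difference of sample means.
stat = lambda a, b: abs(a.mean() - b.mean())
rng = np.random.default_rng(1)
x, y = rng.normal(0, 1, 100), rng.normal(1, 1, 100)
threshold = permutation_quantile(x, y, stat, alpha=0.05, B=500)
print(stat(x, y) > threshold)  # reject the null when the observed statistic exceeds the threshold
```

The paper's level correction goes further, calibrating a shared level across the aggregated tests with B3 bisection steps, but the permutation-quantile step sketched above is the core calibration primitive.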