Learning Kernel Tests Without Data Splitting

Authors: Jonas Kübler, Wittawat Jitkrittum, Bernhard Schölkopf, Krikamol Muandet

NeurIPS 2020

Reproducibility Variable | Result | LLM Response

Research Type | Experimental
  "At the same significance level, our approach's test power is empirically larger than that of the data-splitting approach, regardless of its split proportion." "The empirical results suggest that, at the same significance level, the test power of our approach is larger than that of the data-splitting approach, regardless of the split proportion (cf. Section 5)." "We demonstrate the advantages of OST over data-splitting approaches and the Wald test with kernel two-sample testing problems as described in Section 2."

Researcher Affiliation | Collaboration
  Jonas M. Kübler, Wittawat Jitkrittum, Bernhard Schölkopf, Krikamol Muandet. Max Planck Institute for Intelligent Systems, Tübingen, Germany (EMAIL, EMAIL). Now with Google Research.

Pseudocode | Yes
  "Algorithm 1 One-Sided Test (OST)"

Open Source Code | Yes
  "The code for the experiments is available at https://github.com/MPI-IS/tests-wo-splitting."

Open Datasets | Yes
  "2. MNIST (p = 49): We consider downsampled 7x7 images of the MNIST dataset [40], where P contains all the digits and Q only uneven digits."

Dataset Splits | No
  The paper discusses data splitting, where a portion of the data is used for learning and the rest for testing (e.g., "SPLIT0.1 denotes that 10% of the data are used for learning β and 90% are used for testing"). However, it does not define distinct training, validation, and test splits, nor give percentages for a three-way split.

Hardware Specification | No
  No specific hardware details (e.g., GPU/CPU models, memory, or cloud resources) used to run the experiments are provided in the paper.

Software Dependencies | No
  The paper cites "SciPy: Open source scientific tools for Python" [26] and "The cvxopt linear and quadratic cone program solvers" [43], but gives no version numbers for these or other key software components.

Experiment Setup | Yes
  "For all the setups we estimate the Type-II error for various sample sizes at a level α = 0.05. Error rates are estimated over 5000 independent trials." "For each dataset we consider three different base sets of kernels K and choose σ with the median heuristic: (a) d = 1: K = [k_σ], (b) d = 2: K = [k_σ, k_lin], (c) d = 6: K = [k_{0.25σ}, k_{0.5σ}, k_σ, k_{2σ}, k_{4σ}, k_lin]."
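To make the experiment-setup row concrete, the sketch below shows the ingredients it describes: the median heuristic for choosing the bandwidth σ, the base kernel set of type (c) (Gaussian kernels at scales {0.25, 0.5, 1, 2, 4}·σ plus a linear kernel), and the unbiased squared-MMD statistic that underlies kernel two-sample tests. This is an illustrative sketch, not the authors' released code; the synthetic samples, sample sizes, mean shift, and random seed are assumptions made here for demonstration.

```python
import numpy as np

def median_heuristic(X, Y):
    """Bandwidth sigma = median pairwise Euclidean distance on the pooled sample."""
    Z = np.vstack([X, Y])
    d2 = np.sum((Z[:, None, :] - Z[None, :, :]) ** 2, axis=-1)
    # use only the strictly upper-triangular (off-diagonal) distances
    return np.sqrt(np.median(d2[np.triu_indices_from(d2, k=1)]))

def gaussian_kernel(X, Y, sigma):
    """Gaussian (RBF) kernel matrix k_sigma(x, y) = exp(-||x - y||^2 / (2 sigma^2))."""
    d2 = np.sum((X[:, None, :] - Y[None, :, :]) ** 2, axis=-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def mmd2_unbiased(X, Y, kernel):
    """Unbiased estimate of the squared MMD between the samples X ~ P and Y ~ Q."""
    n, m = len(X), len(Y)
    Kxx, Kyy, Kxy = kernel(X, X), kernel(Y, Y), kernel(X, Y)
    np.fill_diagonal(Kxx, 0.0)  # drop diagonal terms for unbiasedness
    np.fill_diagonal(Kyy, 0.0)
    return (Kxx.sum() / (n * (n - 1))
            + Kyy.sum() / (m * (m - 1))
            - 2.0 * Kxy.mean())

# Hypothetical toy data: P and Q are Gaussians that differ by a mean shift.
rng = np.random.default_rng(0)
X = rng.normal(0.0, 1.0, size=(200, 2))  # sample from P
Y = rng.normal(1.0, 1.0, size=(200, 2))  # sample from Q

sigma = median_heuristic(X, Y)

# Base kernel set of type (c): scaled Gaussian kernels plus a linear kernel.
kernels = {f"k_{s}sigma": (lambda A, B, s=s: gaussian_kernel(A, B, s * sigma))
           for s in (0.25, 0.5, 1.0, 2.0, 4.0)}
kernels["k_lin"] = lambda A, B: A @ B.T

for name, k in kernels.items():
    print(f"{name}: MMD^2 = {mmd2_unbiased(X, Y, k):.4f}")
```

Each kernel in the base set yields one MMD^2 estimate; the paper's methods then differ in how a combination of these statistics is selected and tested (OST versus data splitting).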