On the Optimality of Gaussian Kernel Based Nonparametric Tests against Smooth Alternatives
Authors: Tong Li, Ming Yuan
JMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Numerical experiments are also presented to further demonstrate the practical merits of the methodology. Keywords: Gaussian kernel embedding, maximum mean discrepancy (MMD), nonparametric tests, diverging scaling parameter, minimax optimality, adaptation |
| Researcher Affiliation | Academia | Tong Li, Ming Yuan; Department of Statistics, Columbia University, New York, NY 10027, USA |
| Pseudocode | No | The paper describes methods mathematically and in prose, but does not contain any clearly labeled pseudocode or algorithm blocks with structured steps. |
| Open Source Code | No | The paper discusses computational efficiency and refers to techniques developed in other works (Sutherland et al., 2017; Song et al., 2007), but it does not contain an explicit statement about releasing the authors' own source code or a link to a repository for the methodology described in this paper. |
| Open Datasets | Yes | Finally, we considered applying the proposed self-normalized adaptive test in a data example from Mooij et al. (2016). The data set consists of three variables, altitude (Alt), average temperature (Temp) and average duration of sunshine (Sun) from different weather stations. |
| Dataset Splits | No | For Experiment I we fixed the sample size at n = m = 200; and for Experiment II at n = 400. The number of permutations was set at 100, and significance level at α = 0.05. ... The overall sample size of the data set is 349. Each time we randomly select 150 samples and compute the p-value associated with each DAG. The p-value is again computed based on 100 permutations. While random sampling and permutation numbers are specified, there are no explicit training/test/validation dataset splits (e.g., percentages or counts for distinct sets) provided for reproducibility. |
| Hardware Specification | No | The paper describes several numerical experiments and real-world data analysis, but it does not specify any particular hardware (e.g., CPU, GPU models, or cloud computing instances) used for these experiments. |
| Software Dependencies | No | The paper does not provide specific software names with version numbers for reproducibility. For example, it does not state 'Python 3.8' or 'PyTorch 1.9'. |
| Experiment Setup | Yes | For Experiment I we fixed the sample size at n = m = 200; and for Experiment II at n = 400. The number of permutations was set at 100, and significance level at α = 0.05. ... for Experiment III, the sample sizes were set to be m = n ∈ {25, 50, 75, ..., 200} and dimension d ∈ {1, 10, 100, 1000}; for Experiment IV, the sample sizes were n ∈ {100, 200, ..., 600} and dimension d ∈ {2, 10, 100, 1000}. In both experiments, we fixed the significance level at α = 0.05, did 100 permutations to calibrate the critical values as before. ... The overall sample size of the data set is 349. Each time we randomly select 150 samples and compute the p-value associated with each DAG. The p-value is again computed based on 100 permutations. |
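Since the paper releases no code, the permutation-calibrated testing setup quoted above (100 permutations, α = 0.05, samples of size n = m) can be illustrated with a minimal sketch of a standard Gaussian-kernel MMD two-sample permutation test. This is not the authors' implementation: the biased V-statistic estimator, the scaling parameter `nu`, and all function names here are illustrative assumptions, and the paper's actual methodology (a diverging scaling parameter and a self-normalized adaptive statistic) is more involved.

```python
import numpy as np

def gaussian_kernel(X, Y, nu):
    # Gaussian kernel matrix exp(-nu * ||x - y||^2); nu is an
    # illustrative scaling parameter, not the paper's choice.
    d2 = (np.sum(X**2, axis=1)[:, None]
          + np.sum(Y**2, axis=1)[None, :]
          - 2.0 * X @ Y.T)
    return np.exp(-nu * np.maximum(d2, 0.0))

def mmd2(X, Y, nu):
    # Biased (V-statistic) estimate of the squared MMD.
    return (gaussian_kernel(X, X, nu).mean()
            + gaussian_kernel(Y, Y, nu).mean()
            - 2.0 * gaussian_kernel(X, Y, nu).mean())

def permutation_pvalue(X, Y, nu=1.0, n_perm=100, seed=None):
    # Calibrate the critical value by permuting the pooled sample,
    # as in the experiments (100 permutations, level 0.05).
    rng = np.random.default_rng(seed)
    n = len(X)
    Z = np.vstack([X, Y])
    observed = mmd2(X, Y, nu)
    exceed = 0
    for _ in range(n_perm):
        idx = rng.permutation(len(Z))
        exceed += mmd2(Z[idx[:n]], Z[idx[n:]], nu) >= observed
    # Add-one correction keeps the p-value strictly positive.
    return (1 + exceed) / (1 + n_perm)
```

Rejecting when the p-value falls below 0.05 reproduces the decision rule described in the experiments; with 100 permutations the smallest attainable p-value is 1/101.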