On the Robustness of Kernel Goodness-of-Fit Tests

Authors: Xing Liu, François-Xavier Briol

JMLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We will now evaluate the proposed GOF tests using both synthetic and real data.
Researcher Affiliation | Collaboration | Xing Liu, EMAIL, QuantCo; François-Xavier Briol, EMAIL, Department of Statistical Science, University College London
Pseudocode | Yes | Algorithm 1 Robust-KSD (R-KSD) test for goodness-of-fit evaluation.
Open Source Code | Yes | Code for reproducing all experiments can be found at github.com/XingLLiu/robust-kernel-test.
Open Datasets | Yes | We use the data set as Matsubara et al. (2022); Key et al. (2025), which is a 1-dimensional data set of 82 galaxy velocities (Postman et al., 1986; Roeder, 1990).
Dataset Splits | Yes | To avoid using the same data for model training and testing, we randomly split the data into equal halves, each containing $n_{\text{data}} = 41$ data points.
Hardware Specification | No | The paper does not provide specific details about the hardware used, only mentioning execution times without specifying the processor, GPU, or memory.
Software Dependencies | No | The paper does not list specific software dependencies with version numbers.
Experiment Setup | Yes | Unless otherwise mentioned, all standard KSD tests are based on an IMQ kernel $k(x, x') = h_{\mathrm{IMQ}}(x - x')$, where $h_{\mathrm{IMQ}}(u) = (1 + \lVert u \rVert_2^2 / \lambda^2)^{-1/2}$ with a bandwidth $\lambda^2 > 0$ selected via the median heuristic, i.e., $\lambda_{\mathrm{med}} = \mathrm{Median}\{\lVert X_i - X_j \rVert_2 : 1 \le i < j \le n\}$. All tilted-KSD and robust-KSD tests are based on a tilted IMQ kernel with weight $w(x) = (1 + \lVert x - a \rVert_2^2 / c)^{-b}$, where $a \in \mathbb{R}^d$ and $c > 0$. We fix $a = 0$ and $c = 1$ in all experiments, as all data will always be centered and on a suitable scale. More generally, we could replace $\lVert x - a \rVert_2^2 / c$ by a weighted norm of the form $(x - a)^\top C (x - a)$, where $C \in \mathbb{R}^{d \times d}$ is a pre-conditioning matrix, chosen possibly as the empirical covariance matrix or robust estimates of it. Since our experiments will focus on sub-Gaussian models, we choose $b = 1/2$. This ensures the Stein kernel is bounded. All tests have nominal level $\alpha = 0.05$. The probability of rejection is computed by averaging over 100 repetitions, and the 95% confidence intervals are reported.
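
The kernel choices quoted in the Experiment Setup row can be illustrated with a short NumPy sketch. This is not the authors' implementation (their code is in the repository listed above). The function names are hypothetical, and the tilted kernel is assumed here to take the common form w(x) k(x, x') w(x'), which the quoted text does not spell out. The sketch covers only the base and tilted kernels, not the Stein kernel or the test itself.

```python
import numpy as np


def pairwise_sq_dists(X, Y):
    """Squared Euclidean distances between rows of X (n, d) and Y (m, d)."""
    diff = X[:, None, :] - Y[None, :, :]
    return np.sum(diff ** 2, axis=-1)


def median_heuristic(X):
    """Median of the pairwise distances ||X_i - X_j||_2 over pairs i < j."""
    n = X.shape[0]
    upper = np.triu_indices(n, k=1)  # index pairs with i < j
    return np.median(np.sqrt(pairwise_sq_dists(X, X)[upper]))


def imq_kernel(X, Y, lam):
    """IMQ kernel (1 + ||x - x'||_2^2 / lam^2)^(-1/2), evaluated pairwise."""
    return (1.0 + pairwise_sq_dists(X, Y) / lam ** 2) ** (-0.5)


def tilt_weight(X, a=0.0, c=1.0, b=0.5):
    """Weight w(x) = (1 + ||x - a||_2^2 / c)^(-b); defaults follow a=0, c=1, b=1/2."""
    return (1.0 + np.sum((X - a) ** 2, axis=-1) / c) ** (-b)


def tilted_imq_kernel(X, Y, lam, a=0.0, c=1.0, b=0.5):
    """Assumed tilted form: w(x) * k(x, x') * w(x')."""
    wx = tilt_weight(X, a, c, b)
    wy = tilt_weight(Y, a, c, b)
    return wx[:, None] * imq_kernel(X, Y, lam) * wy[None, :]
```

For a sample `X` of shape `(n, d)`, one would call `lam = median_heuristic(X)` and then `tilted_imq_kernel(X, X, lam)` to obtain the `(n, n)` Gram matrix under these assumptions.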