Infinite-dimensional Mahalanobis Distance with Applications to Kernelized Novelty Detection
Authors: Nikita Zozoulenko, Thomas Cass, Lukas Gonon
JMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In an empirical study on 12 real-world data sets, we demonstrate that the kernelized nearest-neighbour Mahalanobis distance outperforms the traditional kernelized Mahalanobis distance for multivariate time series novelty detection, using state-of-the-art time series kernels such as the signature, global alignment, and Volterra reservoir kernels. |
| Researcher Affiliation | Academia | Nikita Zozoulenko (EMAIL), Department of Mathematics, Imperial College London, UK; Thomas Cass (EMAIL), Department of Mathematics, Imperial College London, UK, and Institute for Advanced Study, Princeton, USA; Lukas Gonon (EMAIL), School of Computer Science, University of St. Gallen, Switzerland, and Department of Mathematics, Imperial College London, UK |
| Pseudocode | Yes | The full procedure for computing the kernelized Mahalanobis distance and its nearest-neighbour variant (conformance score) is detailed in Algorithms 1 to 3. |
| Open Source Code | Yes | Our experiment code is publicly available at https://github.com/nikitazozoulenko/kernel-timeseries-anomaly-detection. |
| Open Datasets | Yes | We will use the UEA multivariate time series repository (Bagnall et al., 2018; Ruiz et al., 2021) in our experimentation, which in recent years has become a standard benchmark for multivariate time series classification. |
| Dataset Splits | Yes | The optimal kernel hyper-parameters were obtained separately for the Mahalanobis distance and the conformance score, via a 10 times repeated 4-fold cross-validation on the training data for each data set. The objective score used in the cross-validation was the sum of ROC-AUC and PR-AUC. When calculating the precision-recall metric, we let the non-outlier class be the positive class. The final model was then evaluated on the out-of-sample test set to obtain the final results, presented in Table 2 and Table 3. |
| Hardware Specification | No | In our open-source code we provide efficient PyTorch implementations of each kernel on both GPU and CPU, as well as an implementation of Algorithms 1 to 3 for computing the kernelized Mahalanobis distance and the kernelized conformance score. We acknowledge computational resources and support provided by the Imperial College Research Computing Service (DOI: 10.14469/hpc/2232). |
| Software Dependencies | No | In our open-source code we provide efficient PyTorch implementations of each kernel on both GPU and CPU, as well as an implementation of Algorithms 1 to 3 for computing the kernelized Mahalanobis distance and the kernelized conformance score. |
| Experiment Setup | Yes | For each kernel, data set, and class label, we run an extensive grid search on the designated training set using repeated k-fold cross-validation with 4 folds and 10 repeats to find the optimal kernel hyper-parameters. Let R^d be the state space, and let T be the length of the time series for a given data set. For each method using the RBF static kernel we use the range σ ∈ √d · {e^-2, e^-1, 1, e^1, e^2}, and similarly for the polynomial kernel we use p ∈ {2, 3, 4}, c ∈ {-2, 1, 2, 4}. For the GAK kernel we use the previously specified σ without the √d term, multiplied by √T · med(‖x − y‖) as is recommended by Cuturi (2011). For the VRK kernel we use τ ∈ {1/2, 1}, and we let λ vary from 0.25 to 0.999 on an inverse logarithmic grid of size 10. The signature kernels inherit hyper-parameters from their respective static kernels. We additionally scale the kernel-lifted paths by s ∈ {1/2, 1, 2, 4} for the truncated signature, and s ∈ {1/4, 1/2, 1} for the untruncated PDE kernel. We use lower values for the PDE signature kernel since untruncated signatures can essentially be viewed as tensor exponentials, whose inner products will blow up if the input values are too large. For the truncated signature kernel we let the truncation level be in {1, 2, 3, 4, 5, 6, 7}. For the randomized signature we use the tanh activation function with the number of features in {10, 25, 50, 100, 200}, and random matrix variances taken from a logarithmic grid of size 8 from 0.00001 to 1. Since the randomized signature is a randomized kernel, we perform the cross-validation with 5 different random seeds for the random matrix initializations and take the best-performing model (using the training set only), as is common practice for randomized methods. Furthermore, for each method we also cross-validate over the Tikhonov regularization parameter α ∈ {10^-8, 10^-5, 10^-2}, whether to concatenate time as an additional dimension to each time series, and the eigenvalue threshold λ in Algorithm 1. We set an upper limit of 50 eigenvalues for the computation of the variance norm. For numerical stability, and to make the choices of α and λ comparable across all kernels and data sets, we normalize all time series kernels K in feature space via K(x,y) / √(K(x,x) K(y,y)). |
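The feature-space normalization quoted in the last row, K(x,y) / √(K(x,x) K(y,y)), can be sketched in a few lines of NumPy. The helper name `normalize_gram` is ours, not taken from the paper's code:

```python
import numpy as np

def normalize_gram(K):
    """Normalize a Gram matrix in feature space: K(x,y) / sqrt(K(x,x) K(y,y))."""
    d = np.sqrt(np.diag(K))
    return K / np.outer(d, d)

# Toy Gram matrix from a linear kernel on three points in R^2.
X = np.array([[1.0, 0.0], [2.0, 0.0], [0.0, 3.0]])
K = X @ X.T
Kn = normalize_gram(K)   # unit diagonal; Kn[0, 1] == 1.0 since x1 and x2 are parallel
```

After normalization every point has unit norm in feature space, so the diagonal is identically 1 and all off-diagonal entries lie in [-1, 1], which is what makes a single grid for α and λ comparable across kernels and data sets.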
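Algorithms 1 to 3 themselves are not reproduced on this page. As a rough illustration of how a kernelized Mahalanobis distance with an eigenvalue threshold, an eigenvalue cap, and Tikhonov regularization can be computed from Gram-matrix data alone, the following NumPy sketch uses one standard eigendecomposition-based formulation; the function name, the centering details, and the exact placement of the regularizer α are our assumptions and may differ from the paper's Algorithm 1:

```python
import numpy as np

def kernel_mahalanobis(K, k_test, alpha=1e-5, eig_threshold=1e-10, max_eigs=50):
    """Regularized Mahalanobis-type distance of a test point from kernel values only.

    K      : (n, n) Gram matrix of the training set
    k_test : (n,) kernel evaluations between the test point and the training set
    """
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n           # centering matrix
    Kc = H @ K @ H                                # centered Gram matrix
    kc = H @ (k_test - K.mean(axis=1))            # centered test kernel vector

    evals, evecs = np.linalg.eigh(Kc)             # eigh returns ascending eigenvalues
    order = np.argsort(evals)[::-1][:max_eigs]    # keep at most 50 eigenvalues
    evals, evecs = evals[order], evecs[:, order]
    keep = evals > eig_threshold                  # eigenvalue threshold
    evals, evecs = evals[keep], evecs[:, keep]

    proj = (evecs.T @ kc) / np.sqrt(evals)        # coordinates in the kernel PCA basis
    return float(np.sum(proj ** 2 / (evals / n + alpha)))  # Tikhonov-regularized sum

# Illustration with a linear kernel: an outlier scores higher than the training mean.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 3))
K = X @ X.T
d_near = kernel_mahalanobis(K, X @ X.mean(axis=0))
d_far = kernel_mahalanobis(K, X @ (X.mean(axis=0) + 10.0))
```

With the linear kernel this reduces to the ordinary regularized Mahalanobis distance in R^d, which is why the far-away point receives a much larger score than the empirical mean.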
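The repeated 4-fold cross-validation with 10 repeats amounts to 40 train/validation splits per hyper-parameter configuration. A minimal index generator for such a scheme looks as follows; `repeated_kfold_indices` is a hypothetical helper for illustration (the paper's code may instead rely on an off-the-shelf splitter):

```python
import numpy as np

def repeated_kfold_indices(n, n_splits=4, n_repeats=10, seed=0):
    """Yield (train_idx, val_idx) index pairs for repeated k-fold cross-validation."""
    rng = np.random.default_rng(seed)
    for _ in range(n_repeats):
        perm = rng.permutation(n)                 # fresh shuffle for every repeat
        folds = np.array_split(perm, n_splits)    # near-equal folds
        for i in range(n_splits):
            val = folds[i]
            train = np.concatenate([folds[j] for j in range(n_splits) if j != i])
            yield train, val

splits = list(repeated_kfold_indices(n=20))       # 4 folds x 10 repeats = 40 splits
```

For each split one would fit on `train`, score `val` with the sum of ROC-AUC and PR-AUC (non-outlier as the positive class, as stated above), and average across all 40 splits to rank configurations.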