Entangled Kernels - Beyond Separability

Authors: Riikka Huusari, Hachem Kadri

JMLR 2021

Reproducibility assessment: for each variable, the result and the supporting LLM response (quoted from the paper) are given below.
Research Type: Experimental
"In this section we investigate the performance of our algorithm with real and artificial data sets. We start with an illustration of its behaviour, and move on to experiments on the predictive performance. In this latter setting, we compare our proposed Entangled Kernel Learning (EKL) algorithm to OKL (Dinuzzo et al., 2011), a kernel learning method for separable kernels (we use the code provided by the authors [7]), and to KRR, kernel ridge regression."
Researcher Affiliation: Academia
"Riikka Huusari, Helsinki Institute for Information Technology HIIT, Department of Computer Science, Aalto University, 02150 Espoo, Finland; Hachem Kadri, Department of Computer Science, Aix-Marseille University, CNRS, LIS, 13013 Marseille, France"
Pseudocode: Yes
"Algorithm 1: Entangled Kernel Learning (EKL)
Input: matrix of features Φ, labels Y
// 1) Kernel learning: solve for Q in Eq. 17 (D = QQ^T) within a sphere manifold
// 2) Learning the predictive function:
if predicting with a scalar-valued kernel then
    c = (tr_p(Ĝ) + λI)^{-1} Y        // O(m^3 + mnp)
else
    c = (Ĝ + λI)^{-1} vec(Y)         // O(r^3 + mnp^2)
Return D = QQ^T, c"
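The two closed-form solves in Algorithm 1's prediction step can be sketched in a few lines of NumPy. This is our own illustrative reading, not the authors' code: `Ghat` stands for Ĝ, the function names are made up, and the partial trace assumes a row-major (sample, output) vectorization of the (np × np) matrix.

```python
import numpy as np

def partial_trace(Ghat, n, p):
    """Partial trace over the p output dimensions of an (n*p, n*p)
    matrix, yielding an (n, n) scalar-valued kernel matrix."""
    G = Ghat.reshape(n, p, n, p)
    return np.einsum('ipjp->ij', G)  # sum over the diagonal output index

def ekl_predict_coefficients(Ghat, Y, lam, scalar_valued=True):
    """Illustrative sketch of the ridge systems in Algorithm 1."""
    n, p = Y.shape
    if scalar_valued:
        # c = (tr_p(Ghat) + lam*I)^{-1} Y
        K = partial_trace(Ghat, n, p)
        return np.linalg.solve(K + lam * np.eye(n), Y)
    # c = (Ghat + lam*I)^{-1} vec(Y)
    return np.linalg.solve(Ghat + lam * np.eye(n * p), Y.reshape(-1))
```

Since Ĝ is positive semi-definite, both regularized systems are positive definite for λ > 0, so `np.linalg.solve` is safe; the scalar-valued branch only ever factorizes an n × n matrix, which is where the complexity gap noted in the algorithm comes from.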
Open Source Code: Yes
"Code is available at RH's personal website, riikkahuusari.com." (footnote 6)
Open Datasets: Yes
"We consider the following regression data sets: Concrete slump test [9] with 103 data samples and three output variables; Sarcos [10] is a data set characterizing robot arm movements with 7 tasks; Weather [11] has daily weather data (p = 365) from 35 stations. [...] Furthermore, we consider the uWave Gesture data set [12], a multi-view data set for classification."
9. UCI data set repository
10. www.gaussianprocess.org/gpml/data/
11. https://www.psych.mcgill.ca/misc/fda/
12. http://www.cs.ucr.edu/~eamonn/time_series_data/
Dataset Splits: Yes
"We took 25 data samples from each class as training samples, and a further 25 from each class as test samples. (...) For the Sarcos data set we consider the setting in Ciliberto et al. (2015) and show the results in Table 3 (prediction is done for all 5000 test samples). (...) The training set partition contains in total 896 data samples from the three views and eight classes. However, as we want to investigate the case with a large distinction between n and p, we only consider the smaller sets of data belonging to each class, with around 100 samples each, which we divide into training and testing sets for the compared algorithms."
Hardware Specification: No
No specific hardware details (such as CPU or GPU models, or memory) are provided in the paper. The paper focuses on algorithmic complexities and experimental performance on various data sets without detailing the computational infrastructure used.
Software Dependencies: No
While the paper mentions using scikit-learn and Pymanopt (pymanopt.github.io; Townsend et al., 2016) for the implementation, specific version numbers for these or other key software dependencies are not provided.
Experiment Setup: Yes
"In all the experiments we cross-validate over various regularization parameters λ, and for EKL also the γs controlling the combination of alignments. (...) We trained EKL by restricting the number of columns in matrix Q, r, to two. (...) In the experiments we consider (normalized) mean squared error (nMSE) and normalized improvement to KRR (nI) (Ciliberto et al., 2015) as error measures."
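For reference, the two error measures named in this row can be sketched as below. These follow one common convention; the exact normalization used by the paper (following Ciliberto et al., 2015) may differ, so treat this as an illustration rather than the paper's definition.

```python
import numpy as np

def nmse(Y_true, Y_pred):
    """Mean squared error normalized by the variance of the targets."""
    return np.mean((Y_true - Y_pred) ** 2) / np.var(Y_true)

def n_improvement(err_method, err_krr):
    """Relative improvement over the KRR baseline (positive = better)."""
    return (err_krr - err_method) / err_krr
```

Under this convention an nMSE of 1 matches predicting the mean of the targets, and nI is 0 when a method exactly ties the KRR baseline.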