reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Online model selection by learning how compositional kernels evolve

Authors: Eura Shin, Predrag Klasnja, Susan Murphy, Finale Doshi-Velez

TMLR 2023

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Using pilot data, we learn a set of kernel evolutions that can be used to quickly select kernels for new test users. KEM reliably selects high-performing kernels for a range of synthetic and real data sets, including two health data sets. ... 6 Experimental Setup ... 7.1 Demonstrative Results on Synthetic Data
Researcher Affiliation	Academia	Eura Shin EMAIL Department of Computer Science Harvard University Predrag Klasnja EMAIL School of Information University of Michigan Susan A. Murphy EMAIL Department of Computer Science Harvard University Finale Doshi-Velez finale@seas.harvard.edu Department of Computer Science Harvard University
Pseudocode	Yes	In algorithm 1, we define a selection model that leverages these learned evolutions to select a kernel for a new test user u at time t. Algorithm 1 Selection method for KEM
Open Source Code	No	The paper does not provide a direct link to a source-code repository or an explicit statement about the release of their code for the methodology described.
Open Datasets	Yes	The datasets used in our experiments, reflected in table 1, have different properties. ... UCI: Energy (Tsanas & Xifara, 2012), Concrete (Yeh, 1998), Boston Housing (Harrison Jr & Rubinfeld, 1978), and Fires (Abid & Izeboudjen, 2019). ... Medical Information Mart for Intensive Care (MIMIC-III) data set Johnson et al. (2016) ... Heart Steps V1 (Klasnja et al., 2019)
Dataset Splits	Yes	10.2.2 Train/Test Splits: Synthetic Experiments: Training Users (10 training users) Testing Users (50 test users) ... The test set is composed of 200 uniformly spaced points along the x-axis from [0, 20]. Real Data Experiments: For the real data experiments, users were randomly assigned to either the training or testing set in a 50 : 50 split.
Hardware Specification	Yes	The reported runtimes are on a 4 core Intel Cascade Lake CPU.
Software Dependencies	No	The paper does not provide specific software dependencies with version numbers, such as programming language versions or library versions, needed to replicate the experiment.
Experiment Setup	Yes	10.2.5 KEM Priors details specific prior distributions and parameters for kernel hyperparameters (lengthscale, period, amplitude, observation noise) for both synthetic and real data. For example, 'Lengthscale: log p(θ_lengthscale) = N(0, 2) for synthetic data, log p(θ_lengthscale) = N(0.2, 0.5) for real data'. Also, 10.2.4 KEM Inference Parameters specifies: 'Gibbs sampling during pilot training: ... up to 200 iterations 10 iterations of global updates ... 100 samples from MH sampling algorithm...'
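
The split and prior settings quoted in the Dataset Splits and Experiment Setup rows can be illustrated with a minimal Python sketch. It is not the authors' code (the Open Source Code row reports that none was released). The function names, the seed handling, and the inclusion of both endpoints in the test grid are assumptions made for illustration. The sketch also reads the quoted prior as "the log of the lengthscale is normally distributed" and treats the second parameter of N(·, ·) as a standard deviation; the quoted text does not say whether it is a standard deviation or a variance.

```python
import math
import random


def split_users(user_ids, seed=0):
    """Randomly assign users to training and testing sets in a 50:50 split."""
    rng = random.Random(seed)
    shuffled = list(user_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def make_test_grid(n_points=200, lo=0.0, hi=20.0):
    """Uniformly spaced test inputs on [lo, hi]; endpoints included (assumption)."""
    step = (hi - lo) / (n_points - 1)
    return [lo + i * step for i in range(n_points)]


def sample_lengthscale(mean, scale, rng):
    """Draw a lengthscale whose logarithm is Normal(mean, scale).

    `scale` is treated as a standard deviation, which is an assumption.
    """
    return math.exp(rng.gauss(mean, scale))


if __name__ == "__main__":
    train_users, test_users = split_users(range(20), seed=1)
    grid = make_test_grid()
    rng = random.Random(0)
    synthetic_draw = sample_lengthscale(0.0, 2.0, rng)  # quoted synthetic-data prior
    real_draw = sample_lengthscale(0.2, 0.5, rng)       # quoted real-data prior
    print(len(train_users), len(test_users), len(grid))
    print(synthetic_draw > 0, real_draw > 0)
```

Exponentiating a normal draw keeps every sampled lengthscale positive, which is the usual reason for placing a prior on the log of a kernel hyperparameter.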