reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Bayesian Leave-One-Out Cross-Validation Approximations for Gaussian Latent Variable Models

Authors: Aki Vehtari, Tommi Mononen, Ville Tolvanen, Tuomas Sivula, Ole Winther

JMLR 2016

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Our empirical results show that the approach based upon a Gaussian approximation to the LOO marginal distribution (the so-called cavity distribution) gives the most accurate and reliable results among the fast methods. [...] The main conclusion from the empirical investigation (Section 4) is the observed superior accuracy/complexity tradeoff of Gaussian latent cavity distribution based LOO estimators. [...] Using several real data sets we present results illustrating the properties of the reviewed LOO-CV approximations.
Researcher Affiliation	Academia	Aki Vehtari EMAIL Tommi Mononen Ville Tolvanen Tuomas Sivula Helsinki Institute of Information Technology HIIT, Department of Computer Science, Aalto University P.O.Box 15400, 00076 Aalto, Finland. Ole Winther EMAIL Technical University of Denmark DK-2800 Lyngby, Denmark
Pseudocode	No	The paper describes various methods like Expectation Propagation and Laplace Approximation but does so descriptively and mathematically. It does not include any clearly labeled pseudocode blocks or algorithms.
Open Source Code	Yes	All the experiments were done using GPstuff toolbox¹ (Vanhatalo et al., 2013). ¹GPstuff is available at http://research.cs.aalto.fi/pml/software/gpstuff/
Open Datasets	Yes	Using several real data sets we present results illustrating the properties of the reviewed LOO-CV approximations. Table 3 lists the basic properties of four classification data sets (Ripley, Australian, Ionosphere, Sonar), one survival data set with censoring (Leukemia), and one data set for a Student's t regression (Boston). All data sets are available from the internet.
Dataset Splits	Yes	The ground truth exact LOO results were obtained by brute force computation of each p(y_i | x_i, D_{-i}) separately by leaving out the ith observation.
Hardware Specification	Yes	The speed comparisons were run with a laptop (Intel Core i5-4300U CPU @ 1.90GHz x 4 + 8GB memory).
Software Dependencies	No	The paper mentions GPstuff toolbox (Vanhatalo et al., 2013) and GPML toolbox (Rasmussen and Nickisch, 2010), but does not specify exact version numbers for these software packages or any other dependencies.
Experiment Setup	Yes	For the classification data sets we use a Bernoulli observation model with probit link. For the Leukemia data set we use a log-logistic model with censoring (as in Gelman et al., 2013, p. 511). For the Boston data set we use a Student's t observation model with ν = 4 degrees of freedom. A fixed ν was chosen as the Laplace approximation (Vanhatalo et al., 2009) had occasional problems when integrating over an unknown ν.
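
The brute-force exact LOO quoted in the Dataset Splits row (refit the model n times, each time leaving out observation i and scoring it) can be illustrated with a minimal sketch. This is not the paper's code, which uses the GPstuff toolbox. It assumes a zero-mean GP regression with Gaussian noise and fixed hyperparameters, so that each predictive density has a closed form; the paper's models use non-Gaussian observation models, where each refit would need an approximation such as Laplace or EP. The data, kernel settings, and all function names below are hypothetical.

```python
import math

def rbf(a, b, lengthscale=1.0, magnitude=1.0):
    """Squared-exponential covariance between two scalar inputs (assumed kernel)."""
    return magnitude * math.exp(-0.5 * ((a - b) / lengthscale) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented copy of A
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def log_predictive(x_train, y_train, x_new, y_new, noise_var=0.1):
    """log p(y_new | x_new, training data) for GP regression with Gaussian noise."""
    n = len(x_train)
    K = [[rbf(x_train[i], x_train[j]) + (noise_var if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    k_star = [rbf(x_new, xi) for xi in x_train]
    alpha = solve(K, y_train)  # (K + noise*I)^-1 y
    v = solve(K, k_star)       # (K + noise*I)^-1 k_star
    mean = sum(k * a for k, a in zip(k_star, alpha))
    var = rbf(x_new, x_new) + noise_var - sum(k * w for k, w in zip(k_star, v))
    return -0.5 * math.log(2 * math.pi * var) - 0.5 * (y_new - mean) ** 2 / var

def brute_force_loo(x, y, noise_var=0.1):
    """Refit n times, each time leaving out observation i and scoring it."""
    terms = []
    for i in range(len(x)):
        x_rest = x[:i] + x[i + 1:]
        y_rest = y[:i] + y[i + 1:]
        terms.append(log_predictive(x_rest, y_rest, x[i], y[i], noise_var))
    return terms

# Toy data, made up for illustration only.
x = [0.0, 0.5, 1.1, 1.9, 2.4, 3.0]
y = [0.1, 0.6, 0.9, 0.8, 0.5, 0.0]
loo_terms = brute_force_loo(x, y)
print(sum(loo_terms))  # sum of the n leave-one-out log predictive densities
```

The cost is n full refits, which is what the fast approximations surveyed in the paper aim to avoid. In this Gaussian-noise special case the leave-one-out terms also have a known closed form from a single fit, which is useful as a correctness check for the loop.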