reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Empirical Priors for Prediction in Sparse High-dimensional Linear Regression

Authors: Ryan Martin, Yiqi Tang

JMLR 2020 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Finally, our numerical results demonstrate the proposed method's strong finite-sample performance in terms of prediction accuracy, uncertainty quantification, and computation time compared to existing Bayesian methods.
Researcher Affiliation	Academia	Ryan Martin EMAIL Yiqi Tang EMAIL Department of Statistics, North Carolina State University, 2311 Stinson Dr., Raleigh, NC 27695
Pseudocode	No	The paper describes the MCMC strategy as a numbered list of steps within the main text (e.g., '1. Given a current state S, sample S_tmp ~ q(· | S).'), but this is not presented as a clearly labeled 'Pseudocode' or 'Algorithm' block.
Open Source Code	Yes	We have also developed an R package, ebreg (Tang and Martin, 2020), that provides users with tools for estimation and variable selection, as described in Martin et al. (2017), and the tools for prediction as presented here.
Open Datasets	Yes	This pharmacogenomics data set is publicly available in the NCI-60 database, and can be accessed via the R package mixOmics (Le Cao et al., 2016), data set multidrug.
Dataset Splits	Yes	The data set includes 60 samples, which we randomly split into a training and testing set of 75% and 25%, respectively.
Hardware Specification	No	The paper mentions computational efficiency comparisons (e.g., 'EB's run-time is about 20% of that for HS'), but it does not specify any particular hardware used for these experiments (e.g., CPU, GPU models, memory, or cloud resources).
Software Dependencies	Yes	We have also developed an R package, ebreg (Tang and Martin, 2020), that provides users with tools for estimation and variable selection, as described in Martin et al. (2017), and the tools for prediction as presented here. ... Tang and Martin. ebreg: An empirical Bayes method for sparse high-dimensional linear regression, 2020. URL https://CRAN.R-project.org/package=ebreg. R package version 0.1.2.
Experiment Setup	Yes	Our method, which we denote by EB, is as described in Section 3, with the following hyperparameter settings: the complexity prior q_n for the size |S| in (3) uses a = 0.05 and c = 1; the posterior construction in Section 2 uses γ = 0.005 in (4) and α = 0.99 in (6); and the inverse gamma prior for σ² described in Section 2.2 has shape and scale parameters a_0 = 0.01 and b_0 = 4, respectively. ... both the EB and HS methods return 5000 posterior predictive samples after a burn-in of 1000.
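
The Dataset Splits entry above reports a random 75%/25% split of 60 samples. As a rough illustration only, the Python sketch below shows what such a split might look like on sample indices. The function name split_indices, the seed value, and the use of Python's random module are assumptions made for this example; the paper's own analysis uses R, and the quoted text gives no seed or splitting procedure.

```python
import random


def split_indices(n_samples, test_fraction=0.25, seed=0):
    """Randomly partition range(n_samples) into train and test index lists.

    Hypothetical helper: the seed is an assumption, since the quoted
    text does not report one.
    """
    rng = random.Random(seed)
    indices = list(range(n_samples))
    rng.shuffle(indices)
    # Number of held-out samples, rounded to the nearest integer.
    n_test = round(n_samples * test_fraction)
    test = sorted(indices[:n_test])
    train = sorted(indices[n_test:])
    return train, test


# 60 samples, as reported for the data set in the table above.
train_idx, test_idx = split_indices(60)
print(len(train_idx), len(test_idx))
```

With 60 samples, a 25% test fraction gives 15 test samples and 45 training samples. Which samples land in each set depends on the seed, so results from a single split of this size may vary noticeably between runs.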