reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Distributed Bayesian Varying Coefficient Modeling Using a Gaussian Process Prior

Authors: Rajarshi Guhaniyogi, Cheng Li, Terrance D. Savitsky, Sanvesh Srivastava

JMLR 2022 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	This section evaluates the performance of methods based on the divide-and-conquer technique for inference and predictions in Bayesian VCMs using a simulation study and a real data analysis.
Researcher Affiliation	Academia	Rajarshi Guhaniyogi EMAIL Department of Statistics Texas A&M University College Station, TX 77843-3143, USA; Cheng Li EMAIL Department of Statistics and Data Science National University of Singapore Singapore 117546, Singapore; Terrance D. Savitsky EMAIL U.S. Bureau of Labor Statistics Office of Survey Methods Research Washington, DC 20212, USA; Sanvesh Srivastava EMAIL Department of Statistics and Actuarial Science University of Iowa Iowa City, IA 52242, USA
Pseudocode	Yes	In summary, the sampling algorithm for drawing from the posterior distribution of $(\alpha, \Gamma, \tau^2, \theta_1, \ldots, \theta_q)$ starts from an initial value of parameters $(\alpha^{(0)}, \Gamma^{(0)}, \tau^{2(0)}, \theta_1^{(0)}, \ldots, \theta_q^{(0)})$ and cycles through the following four steps for $t = 0, 1, \ldots$: (a) draw $\nu^{(t+1)}$ given $y, \alpha^{(t)}, \Gamma^{(t)}, \tau^{2(t)}, \theta_1^{(t)}, \ldots, \theta_q^{(t)}$ from $N(\mu_\nu, \Sigma_\nu)$, where $\mu_\nu, \Sigma_\nu$ are defined in (28); (b) draw $\tau^{2(t+1)}$ given $y, \nu^{(t+1)}$ using (32); (c) draw $(\alpha^{(t+1)}, \gamma^{(t+1)})$ given $y, \nu^{(t+1)}, \tau^{2(t+1)}$ using (33) and the vectorization of $\gamma^{(t+1)}$ is reversed to obtain $\Gamma^{(t+1)}$; and (d) draw $\theta_1^{(t+1)}, \ldots, \theta_q^{(t+1)}$ given $\nu^{(t+1)}$ using ESS, the likelihood in (36), and the relation between $\theta_a$ and $\theta_a$ in (35).
Open Source Code	No	Since there is no open source implementation available for the Bayesian VCMs with bivariate response vector, we implement it by ourselves following the DA-type algorithm discussed in Section 2.3. Additionally, Finley and Banerjee (2020) offer spSVC in the spBayes R package for fitting spatial VCMs, which are special cases of (1) with d = 2, and fits to our simulation settings.
Open Datasets	Yes	The data on SST and SSS are obtained from the Hadley center observations under the met office in UK (www.metoffice.gov.uk/hadobs, more description available in Kennedy et al. (2011)).
Dataset Splits	Yes	In both simulations, the cardinality of the set of indexes U, where the function estimation and prediction are evaluated, is set at 300.
Hardware Specification	No	The paper mentions computational efficiency of algorithms and uses terms like 'massive data applications' and 'computationally challenging' but does not specify any particular hardware (e.g., GPU models, CPU types) used for running the experiments.
Software Dependencies	No	The paper mentions 'coda R package (Plummer, 2003)' for computing effective sample size and 'spBayes R package' as a competitor's tool. However, it does not provide specific version numbers for these or any other software dependencies used in their own experimental setup.
Experiment Setup	Yes	The sampling algorithm in each subset runs for 10,000 iterations, and the Markov chain is thinned by collecting every fifth posterior sample after discarding the first 5,000 posterior samples as burn-in. ... The sampling algorithm uses a sparse GP based on the FITC approximation with r = 400 inducing points (Quiñonero-Candela and Rasmussen, 2005; Alvarez et al., 2012).
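
The Pseudocode and Experiment Setup rows mention two mechanics that are easy to show on a toy problem: an elliptical slice sampling (ESS) update and the quoted burn-in and thinning schedule. The sketch below is not the paper's sampler. It assumes a two-dimensional target with a standard normal prior and a unit-variance Gaussian likelihood, and the names `elliptical_slice_step` and `run_chain` are hypothetical. Only the schedule (10,000 iterations, first 5,000 discarded, every fifth draw kept) is taken from the table.

```python
import numpy as np

# Schedule quoted in the Experiment Setup row.
N_ITER, BURN_IN, THIN = 10_000, 5_000, 5


def elliptical_slice_step(f, prior_chol, log_lik, rng):
    """One generic ESS update for a zero-mean Gaussian prior N(0, L L^T)."""
    nu = prior_chol @ rng.standard_normal(f.shape[0])  # auxiliary prior draw
    # Slice height: log of a Uniform(0, 1) draw equals minus an Exp(1) draw.
    log_y = log_lik(f) - rng.exponential()
    theta = rng.uniform(0.0, 2.0 * np.pi)
    lo, hi = theta - 2.0 * np.pi, theta
    while True:
        proposal = f * np.cos(theta) + nu * np.sin(theta)
        if log_lik(proposal) > log_y:
            return proposal
        # Shrink the angle bracket toward theta = 0 (the current state).
        if theta < 0.0:
            lo = theta
        else:
            hi = theta
        theta = rng.uniform(lo, hi)


def run_chain(y, seed=0):
    """Toy chain (an assumption for illustration): prior N(0, I), y ~ N(f, I)."""
    rng = np.random.default_rng(seed)
    prior_chol = np.eye(y.shape[0])

    def log_lik(f):
        return -0.5 * np.sum((y - f) ** 2)

    f = np.zeros_like(y)
    draws = np.empty((N_ITER, y.shape[0]))
    for t in range(N_ITER):
        f = elliptical_slice_step(f, prior_chol, log_lik, rng)
        draws[t] = f
    # Discard the burn-in, then keep every THIN-th draw.
    return draws[BURN_IN::THIN]


kept = run_chain(np.array([1.0, -1.0]))
print(kept.shape)         # (1000, 2)
print(kept.mean(axis=0))  # compare with the analytic posterior mean y / 2
```

Under this schedule each subset chain retains (10,000 - 5,000) / 5 = 1,000 posterior draws. In this toy model the posterior is N(y / 2, I / 2), so the printed mean can be checked against [0.5, -0.5].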