reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Continuous-Time Birth-Death MCMC for Bayesian Regression Tree Models

Authors: Reza Mohammadi, Matthew Pratola, Maurits Kaptein

JMLR 2020

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We provide theoretical support of the algorithm for Bayesian regression tree models and demonstrate its performance in a simulated example. Keywords: Bayesian Regression trees, Decision trees, Continuous-time MCMC, Bayesian structure learning, Birth-death process, Bayesian model averaging, Bayesian model selection. [...] 4. Empirical evaluation of our sampling approach We examine here the performance of the proposed CT-MCMC search algorithm based on a simulation scenario that is often used in the regression tree literature. [...] For each of the above search algorithms, we report the following measurements: MSE: This is the Mean Square Error. [...] Effective Sample Size: This is the number of effective independent draws that the algorithm generates.
Researcher Affiliation	Academia	Reza Mohammadi EMAIL Amsterdam Business School University of Amsterdam Amsterdam, The Netherlands Matthew Pratola EMAIL Department of Statistics The Ohio State University Ohio, USA Maurits Kaptein EMAIL Statistics and Research Methods University of Tilburg Tilburg, The Netherlands
Pseudocode	Yes	Algorithm 1. CT-MCMC search algorithm Input: A tree (T, θ_T), data D. [...] Algorithm 2. CT-MCMC search algorithm exploiting conjugacy Input: A tree (T, θ_T), data D.
Open Source Code	Yes	The current implementation of the methods proposed in this paper are available at https://bitbucket.org/mpratola/openbt.
Open Datasets	No	The synthetic data set consists of n = 300 data points with (x1, x2, x3) covariates where [...] The response y is calculated for n = 300 data points as: [...] To calculate the MSE, we generate another synthetic data set consists of n = 300 data points as a test set.
Dataset Splits	Yes	The synthetic data set consists of n = 300 data points with (x1, x2, x3) covariates where [...] The response y is calculated for n = 300 data points as: [...] To calculate the MSE, we generate another synthetic data set consists of n = 300 data points as a test set.
Hardware Specification	Yes	All the computations were carried out on a Mac Book Pro with 2.9 GHz processor and Quad-Core Intel Core i7.
Software Dependencies	No	We perform all the computations in R and the computationally intensive tasks are implemented in parallel in C and interfaced in R.
Experiment Setup	Yes	To evaluate the performance of the CT-MCMC search algorithm with compare with the RJ-MCMC, we run all the above search algorithms in the same conditions with 20,000 iterations and 1,000 as a burn-in. [...] Table 1 presents the results for σ² = 1 which is a relatively challenging, high-noise, scenario.
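
The two measurements quoted in the table (MSE on a held-out test set, and effective sample size after discarding burn-in) can be illustrated with a minimal Python sketch. This is not the authors' code: the page does not say how the paper estimated effective sample size, so the truncation rule below (stop at the first non-positive autocorrelation) is an assumption, and `toy_chain` is a hypothetical AR(1) stand-in for real sampler output. Only the iteration and burn-in counts come from the Experiment Setup row.

```python
import random

N_ITER = 20_000   # total iterations, as quoted in the Experiment Setup row
BURN_IN = 1_000   # burn-in draws to discard, as quoted in the same row


def mse(y_true, y_pred):
    """Mean squared error between held-out responses and predictions."""
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)


def effective_sample_size(chain, max_lag=200):
    """ESS = n / (1 + 2 * sum of autocorrelations).

    The sum is truncated at the first non-positive autocorrelation,
    which is an assumed, simple rule and not taken from the paper.
    """
    n = len(chain)
    mean = sum(chain) / n
    centered = [x - mean for x in chain]
    var = sum(c * c for c in centered) / n
    if var == 0:
        return float(n)
    rho_sum = 0.0
    for lag in range(1, min(max_lag, n - 1) + 1):
        cov = sum(centered[i] * centered[i + lag] for i in range(n - lag)) / n
        rho = cov / var
        if rho <= 0:
            break
        rho_sum += rho
    return n / (1 + 2 * rho_sum)


def toy_chain(n, phi=0.9, seed=0):
    """Hypothetical stand-in for sampler output: an AR(1) series."""
    rng = random.Random(seed)
    x, out = 0.0, []
    for _ in range(n):
        x = phi * x + rng.gauss(0, 1)
        out.append(x)
    return out


chain = toy_chain(N_ITER)
kept = chain[BURN_IN:]            # drop the burn-in draws
ess = effective_sample_size(kept)
print(f"kept draws: {len(kept)}, estimated ESS: {ess:.0f}")
```

Because successive draws in the toy chain are positively correlated, the estimated ESS comes out well below the 19,000 retained draws, which is why a count of effective independent draws is reported alongside the raw iteration count when comparing samplers.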