Boulevard: Regularized Stochastic Gradient Boosted Trees and Their Limiting Distribution

Authors: Yichen Zhou, Giles Hooker

JMLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | A simulation study and real world examples provide support for both the predictive accuracy of the model and its limiting behavior. Keywords: gradient boosting, regression tree, regularization, limiting distribution. ... We have conducted an empirical study to demonstrate the performance of Boulevard.
Researcher Affiliation | Academia | Yichen Zhou EMAIL Department of Statistics and Data Science Cornell University Ithaca, NY 14853, USA. Giles Hooker EMAIL Department of Statistics University of California, Berkeley Berkeley, CA 94720, USA.
Pseudocode | Yes | Algorithm 1 (Boulevard). ... Algorithm 2 (Trees for Non-adaptive Boosting). ... Algorithm 3 (Tail Snapshot Boulevard).
Open Source Code | Yes | The empirical study code is provided at: https://github.com/siriuz42/boulevard.git
Open Datasets | Yes | Results on four real world data sets selected from UCI Machine Learning Repository (Dheeru and Karra Taniskidou, 2017; Tüfekci, 2014; Kaya et al., 2012) are shown in Figure 4.
Dataset Splits | Yes | All curves are averages after 5-fold cross validation. ... Figure 8 shows the result when we generate the 90% reproduction intervals for two real world datasets from UCI, namely CCPP and CASP. For each dataset, we take the first 10 examples as test examples, and split the rest of the dataset into 11 folds.
Hardware Specification | No | The paper does not explicitly describe the specific hardware used to run its experiments. It mentions 'simulation study' and 'empirical study' but no details on CPU, GPU, or other computing resources.
Software Dependencies | No | The paper does not provide specific version numbers for software dependencies or libraries used in the experiments. While it provides a link to code, the specific versions are not detailed within the paper itself.
Experiment Setup | Yes | Table 1: Parameters used in empirical study.

label            n      θ    ntree  k   λ
MSE-(1-4)        5000   0.3  1000   20  0.8
MSE-Boston       506    0.8  1000   5   0.8
MSE-CCPP         9568   0.5  1000   50  0.8
MSE-CASP         20000  0.5  1000   50  0.8
MSE-Airfoil      1503   0.8  1000   40  0.8
Limiting-(1-4)   1000   0.8  2000   10  0.5
Variance-(1-4)   5000   0.8  3000   20  0.5
RI-(1-2)         1000   0.8  2000   10  0.5
RI-(3-4)         5000   0.8  2000   10  0.5
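
The Dataset Splits and Experiment Setup entries can be sketched in code. The following is a minimal illustration, not the authors' implementation: the names `TABLE_1` and `split_holdout_and_folds`, the ASCII keys `theta` and `lam` (standing in for θ and λ), and the interleaved fold assignment are all assumptions made here. The quoted text only says that the first 10 examples are held out and the rest is split into 11 folds; it does not say whether the data is shuffled first, so shuffling is off by default.

```python
import random

# Table 1 values copied from the Experiment Setup entry.
# Key names (theta, lam) are hypothetical ASCII stand-ins for θ and λ.
TABLE_1 = {
    "MSE-(1-4)":      dict(n=5000,  theta=0.3, ntree=1000, k=20, lam=0.8),
    "MSE-Boston":     dict(n=506,   theta=0.8, ntree=1000, k=5,  lam=0.8),
    "MSE-CCPP":       dict(n=9568,  theta=0.5, ntree=1000, k=50, lam=0.8),
    "MSE-CASP":       dict(n=20000, theta=0.5, ntree=1000, k=50, lam=0.8),
    "MSE-Airfoil":    dict(n=1503,  theta=0.8, ntree=1000, k=40, lam=0.8),
    "Limiting-(1-4)": dict(n=1000,  theta=0.8, ntree=2000, k=10, lam=0.5),
    "Variance-(1-4)": dict(n=5000,  theta=0.8, ntree=3000, k=20, lam=0.5),
    "RI-(1-2)":       dict(n=1000,  theta=0.8, ntree=2000, k=10, lam=0.5),
    "RI-(3-4)":       dict(n=5000,  theta=0.8, ntree=2000, k=10, lam=0.5),
}


def split_holdout_and_folds(n_samples, n_test=10, n_folds=11, shuffle=False, seed=0):
    """Hold out the first n_test indices, then split the rest into n_folds folds.

    Assumption: folds are assigned by interleaving (index i goes to fold
    i mod n_folds); the source does not specify the assignment rule.
    """
    indices = list(range(n_samples))
    test_idx = indices[:n_test]      # "the first 10 examples as test examples"
    rest = indices[n_test:]
    if shuffle:                      # optional; not stated in the source
        random.Random(seed).shuffle(rest)
    folds = [rest[i::n_folds] for i in range(n_folds)]
    return test_idx, folds


if __name__ == "__main__":
    n = TABLE_1["MSE-CCPP"]["n"]     # CCPP sample size from Table 1
    test_idx, folds = split_holdout_and_folds(n)
    print(len(test_idx), len(folds), [len(f) for f in folds])
```

Fold sizes differ by at most one under this scheme, and every non-test index lands in exactly one fold.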