Boulevard: Regularized Stochastic Gradient Boosted Trees and Their Limiting Distribution
Authors: Yichen Zhou, Giles Hooker
JMLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | A simulation study and real world examples provide support for both the predictive accuracy of the model and its limiting behavior. Keywords: gradient boosting, regression tree, regularization, limiting distribution. ... We have conducted an empirical study to demonstrate the performance of Boulevard. |
| Researcher Affiliation | Academia | Yichen Zhou, Department of Statistics and Data Science, Cornell University, Ithaca, NY 14853, USA. Giles Hooker, Department of Statistics, University of California, Berkeley, Berkeley, CA 94720, USA. |
| Pseudocode | Yes | Algorithm 1 (Boulevard). ... Algorithm 2 (Trees for Non-adaptive Boosting). ... Algorithm 3 (Tail Snapshot Boulevard). |
| Open Source Code | Yes | The empirical study code is provided at: https://github.com/siriuz42/boulevard.git |
| Open Datasets | Yes | Results on four real world data sets selected from UCI Machine Learning Repository (Dheeru and Karra Taniskidou, 2017; Tüfekci, 2014; Kaya et al., 2012) are shown in Figure 4. |
| Dataset Splits | Yes | All curves are averages after 5-fold cross validation. ... Figure 8 shows the result when we generate the 90% reproduction intervals for two real world datasets from UCI, namely CCPP and CASP. For each dataset, we take the first 10 examples as test examples, and split the rest of the dataset into 11 folds. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware used to run its experiments. It mentions 'simulation study' and 'empirical study' but no details on CPU, GPU, or other computing resources. |
| Software Dependencies | No | The paper does not provide specific version numbers for software dependencies or libraries used in the experiments. While it provides a link to code, the specific versions are not detailed within the paper itself. |
| Experiment Setup | Yes | Table 1: Parameters used in empirical study. Columns: label, n, θ, ntree, k, λ. MSE-(1-4): 5000, 0.3, 1000, 20, 0.8; MSE-Boston: 506, 0.8, 1000, 5, 0.8; MSE-CCPP: 9568, 0.5, 1000, 50, 0.8; MSE-CASP: 20000, 0.5, 1000, 50, 0.8; MSE-Airfoil: 1503, 0.8, 1000, 40, 0.8; Limiting-(1-4): 1000, 0.8, 2000, 10, 0.5; Variance-(1-4): 5000, 0.8, 3000, 20, 0.5; RI-(1-2): 1000, 0.8, 2000, 10, 0.5; RI-(3-4): 5000, 0.8, 2000, 10, 0.5. |
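
The Dataset Splits row describes the reproduction-interval setup for CCPP and CASP: the first 10 examples are held out as test examples and the rest of the dataset is split into 11 folds. The following is a minimal sketch of that split on row indices, not the authors' code. The function name `split_for_reproduction_intervals`, the `seed` parameter, and the decision to shuffle the remaining rows before assigning folds are all assumptions; the quoted text does not say whether the folds are contiguous or randomized.

```python
import random


def split_for_reproduction_intervals(n_rows, n_test=10, n_folds=11, seed=0):
    """Hold out the first `n_test` rows, then split the rest into `n_folds` folds.

    Returns (test_idx, folds), where `folds` is a list of index lists.
    Hypothetical helper: the name and signature are not from the paper.
    """
    if n_rows < n_test + n_folds:
        raise ValueError("not enough rows for the requested split")

    # The first n_test examples are the fixed test examples.
    test_idx = list(range(n_test))

    # Everything else is distributed across the folds.
    rest = list(range(n_test, n_rows))
    # Assumption: shuffle before assigning folds; the source does not specify.
    random.Random(seed).shuffle(rest)

    # Round-robin assignment keeps fold sizes within one of each other.
    folds = [rest[i::n_folds] for i in range(n_folds)]
    return test_idx, folds


if __name__ == "__main__":
    # 9568 is the sample size listed for CCPP in Table 1.
    test_idx, folds = split_for_reproduction_intervals(9568)
    print(len(test_idx), [len(f) for f in folds])
```

Each fold can then be used to fit one model, with predictions at the 10 held-out examples compared across the fitted models. How the folds are combined into 90% reproduction intervals is described in the paper and is not reproduced here.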