Optimal Weighted Random Forests
Authors: Xinyu Chen, Dalei Yu, Xinyu Zhang
JMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Numerical studies conducted on real-world data sets and semi-synthetic data sets indicate that these algorithms outperform the equal-weight forest and two other weighted RFs proposed in the existing literature in most cases. |
| Researcher Affiliation | Academia | Xinyu Chen (EMAIL), International Institute of Finance, School of Management, University of Science and Technology of China, Hefei, 230026, Anhui, China; Dalei Yu (EMAIL), School of Mathematics and Statistics, Xi'an Jiaotong University, Xi'an, 710049, Shaanxi, China; Xinyu Zhang (EMAIL), Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, 100190, China, and International Institute of Finance, School of Management, University of Science and Technology of China, Hefei, 230026, Anhui, China |
| Pseudocode | Yes | Algorithm 1: 1step-WRFopt; Algorithm A.1: CART; Algorithm B.1: 2steps-WRFopt |
| Open Source Code | Yes | The data and codes are available publicly on https://github.com/XinyuChen-hey/Optimal-Weighted-Random-Forests. |
| Open Datasets | Yes | To assess the prediction performance of different weighted RFs in practical situations, we used 11 data sets from the UCI data repository for machine learning (Dua and Graff, 2017). Because most of these data sets are low-dimensional, one additional high-dimensional data set from openml.org (Vanschoren et al., 2013) was also included. |
| Dataset Splits | Yes | We randomly partitioned each data set into training data, testing data and validation data, in the ratio of 0.5 : 0.3 : 0.2. |
| Hardware Specification | No | The paper does not explicitly mention any specific hardware used for running its experiments. It only refers to computational resources generally, without providing details like CPU/GPU models, memory, or cloud instance types. |
| Software Dependencies | No | Many contemporary software packages, such as quadprog in R or MATLAB, can effectively handle quadratic programming problems. ...R package randomForest. ...generated additional attributes by sklearn.preprocessing.PolynomialFeatures (Pedregosa et al., 2011). |
| Experiment Setup | Yes | In this section, the number of trees Mn was set to 100. Before each split, the dimension of the random feature sub-space q was set to p/3, which is the default value in the regression mode of the R package randomForest. We set the minimum leaf size nodesize to n in CART trees and 5 in SUT trees, in order to control the depth of trees. |
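The Dataset Splits and Experiment Setup rows can be sketched in code. Below is a minimal, hedged reconstruction in Python with scikit-learn rather than the authors' R code: a random 0.5 : 0.3 : 0.2 train/test/validation partition and an equal-weight baseline forest with 100 trees and q = p/3 features tried per split. The synthetic dataset and all variable names here are illustrative assumptions, not the paper's data.

```python
# Sketch of the described setup (not the authors' code): sklearn stand-in
# for the R randomForest configuration quoted in the table.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
# Illustrative synthetic data; the paper uses UCI / OpenML data sets.
X, y = make_regression(n_samples=200, n_features=9, noise=1.0, random_state=0)

# Random 0.5 : 0.3 : 0.2 partition into training / testing / validation.
idx = rng.permutation(len(X))
n_train = int(0.5 * len(X))
n_test = int(0.3 * len(X))
train, test, val = np.split(idx, [n_train, n_train + n_test])

# Equal-weight forest baseline: Mn = 100 trees, q = p/3 features per split
# (max_features accepts a fraction of the feature count in sklearn).
rf = RandomForestRegressor(n_estimators=100, max_features=1 / 3, random_state=0)
rf.fit(X[train], y[train])
test_mse = np.mean((rf.predict(X[test]) - y[test]) ** 2)
```

The weighted variants studied in the paper would then reweight the per-tree predictions instead of averaging them uniformly.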
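The Software Dependencies row notes that the tree-weight optimization is a quadratic program solvable by packages such as quadprog in R or MATLAB. As a hedged illustration only, here is the generic form of that step in Python using scipy's SLSQP solver as a stand-in: minimize a quadratic w' H w over the weight simplex. The residual-cross-product H used below is an assumed placeholder objective, not the paper's exact criterion.

```python
# Illustrative simplex-constrained quadratic program over tree weights.
# H built from per-tree held-out residuals is an assumption for this sketch.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
M, n = 10, 50                      # number of trees, held-out sample size
resid = rng.normal(size=(M, n))    # hypothetical per-tree residuals

H = resid @ resid.T / n            # M x M residual cross-product (PSD)
obj = lambda w: w @ H @ w          # quadratic prediction-error objective

w0 = np.full(M, 1.0 / M)           # start from the equal-weight forest
res = minimize(obj, w0, method="SLSQP",
               bounds=[(0.0, 1.0)] * M,
               constraints={"type": "eq", "fun": lambda w: w.sum() - 1.0})
w = res.x                          # optimized weights on the simplex
```

A dedicated QP solver (quadprog, cvxpy) would be the more direct analogue of what the paper describes; SLSQP is used here only to keep the sketch dependency-light.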