Generalized Random Forests Using Fixed-Point Trees
Authors: David Fleischer, David A. Stephens, Archer Y. Yang
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate that our method achieves a speedup of multiple times over standard GRFs without compromising statistical accuracy. Experiments on both simulated and real-world data validate our approach. Our findings suggest that the proposed method is a scalable alternative for localized effect estimation in machine learning and causal inference applications. |
| Researcher Affiliation | Academia | ¹Department of Mathematics and Statistics, McGill University, Montreal, Canada; ²Mila, Quebec AI Institute, Montreal, Quebec, Canada. Correspondence to: Archer Y. Yang <EMAIL>. |
| Pseudocode | Yes | Algorithm 1: The fixed-point tree algorithm... Algorithm 2: Stage I GRF-FPT: Training a generalized random forest using fixed-point trees... Algorithm 3: GRF-FPT: Estimates of θ(x) |
| Open Source Code | Yes | We implement the GRF-FPT algorithm in a fork of grf (Tibshirani et al., 2024) available at https://github.com/dfleis/grf... Code and data for reproducing all experiments and figures are available at https://github.com/dfleis/grf-experiments. |
| Open Datasets | Yes | The data, first appearing in Kelley Pace & Barry (1997), contains 20,640 observations of housing prices taken from the 1990 California census... can be directly obtained from the Carnegie Mellon StatLib repository (https://lib.stat.cmu.edu/datasets/houses.zip). |
| Dataset Splits | Yes | Throughout our experiments we use subsampling ratio s/n = 0.5... To assess estimation accuracy, we evaluate the mean squared error (MSE) of ŷ(x) across 50 replications of the model and testing on a separate set of 5,000 observations... We varied the subsampling ratio s/n ∈ {0.25, 0.50, 0.75} under VCM Setting 3 over a forest of 10 trees carried out using GRF-FPT2 and GRF-grad. |
| Hardware Specification | No | No specific hardware details (like GPU/CPU models, memory) are mentioned in the paper, only that the algorithm is implemented in an R package. |
| Software Dependencies | Yes | We implement the GRF-FPT algorithm in a fork of grf (Tibshirani et al., 2024) available at https://github.com/dfleis/grf. R package version 2.4.0. |
| Experiment Setup | Yes | Throughout our experiments we use subsampling ratio s/n = 0.5... All versions fit a forest of 2000 trees, the default settings of the original R implementation (Tibshirani et al., 2024), a subsample ratio of 0.5, and a target minimum node size of 5 observations. |
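
The evaluation protocol quoted in the Dataset Splits and Experiment Setup rows (a subsample of size s drawn from n training points, MSE on a separate test set, averaged over replications) can be sketched in a few lines. This is a minimal illustration, not the authors' code: the paper's implementation is an R fork of grf, and every name here (`subsample_indices`, `replicate_mse`, `make_data`, `fit_predict`) is hypothetical. The toy data generator and the mean predictor are placeholders standing in for the paper's simulation settings and for GRF-FPT.

```python
import random
from statistics import fmean


def subsample_indices(n, ratio, rng):
    """Draw s = floor(ratio * n) distinct indices, i.e. a subsample with s/n = ratio."""
    s = int(ratio * n)
    return rng.sample(range(n), s)


def mse(predictions, targets):
    """Mean squared error between two equal-length sequences."""
    return fmean([(p - t) ** 2 for p, t in zip(predictions, targets)])


def replicate_mse(fit_predict, make_data, n_train, n_test, n_reps, seed=0):
    """Average test-set MSE over independent replications.

    fit_predict(x_train, y_train, x_test, rng) -> predictions (hypothetical hook)
    make_data(n, rng) -> (x, y)                               (hypothetical hook)
    """
    rng = random.Random(seed)
    scores = []
    for _ in range(n_reps):
        x_train, y_train = make_data(n_train, rng)
        x_test, y_test = make_data(n_test, rng)
        preds = fit_predict(x_train, y_train, x_test, rng)
        scores.append(mse(preds, y_test))
    return fmean(scores), scores


# Placeholder data: y = 2x + noise. NOT one of the paper's simulation settings.
def make_data(n, rng):
    x = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    y = [2.0 * xi + rng.gauss(0.0, 0.1) for xi in x]
    return x, y


# Placeholder model: predicts the mean response of a half-sized subsample.
# It stands in for a forest only to make the protocol runnable.
def fit_predict(x_train, y_train, x_test, rng, ratio=0.5):
    idx = subsample_indices(len(y_train), ratio, rng)
    center = fmean([y_train[i] for i in idx])
    return [center for _ in x_test]


if __name__ == "__main__":
    # Subsample sizes for the ratios quoted above, at n = 20,640.
    for ratio in (0.25, 0.50, 0.75):
        print(ratio, len(subsample_indices(20640, ratio, random.Random(0))))
    mean_score, _ = replicate_mse(fit_predict, make_data, 200, 5000, n_reps=50)
    print("mean MSE over 50 replications:", mean_score)
```

Passing the random generator explicitly keeps each replication reproducible from a single seed, which makes a comparison between two methods on the same replications straightforward.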