NRGBoost: Energy-Based Generative Boosted Trees
Authors: João Bravo
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We explore generative extensions of these popular algorithms with a focus on explicitly modeling the data density (up to a normalization constant), thus enabling other applications besides sampling. As our main contribution we propose an energy-based generative boosting algorithm that is analogous to the second-order boosting implemented in popular libraries like XGBoost. We show that, despite producing a generative model capable of handling inference tasks over any input variable, our proposed algorithm can achieve similar discriminative performance to GBDT on a number of real world tabular datasets, outperforming alternative generative approaches. At the same time, we show that it is also competitive with neural-network-based models for sampling. |
| Researcher Affiliation | Industry | João Bravo Feedzai EMAIL |
| Pseudocode | Yes | In Algorithm 1 we provide a high-level overview of the training loop for NRGBoost. |
| Open Source Code | Yes | Code is available at https://github.com/ajoo/nrgboost. |
| Open Datasets | Yes | We evaluate NRGBoost on five tabular datasets from the UCI Machine Learning Repository (Dheeru & Karra Taniskidou, 2017): Abalone (AB), Physicochemical Properties of Protein Tertiary Structure (PR), Adult (AD), MiniBooNE (MBNE) and Covertype (CT) as well as the California Housing (CH) dataset available through scikit-learn (Pedregosa et al., 2011). We also include a downsampled version of MNIST (by 2x along each dimension), which allows us to visually assess the quality of individual samples, something that is generally difficult with structured tabular data. |
| Dataset Splits | Yes | Table 5: Dataset Information. We respect the original test sets of each dataset when provided, otherwise we set aside 20% of the original dataset as a test set. 20% of the remaining data is set aside as a validation set used for hyperparameter tuning. ... For the single-variable inference evaluation, the best models are selected by their discriminative performance on a validation set. The entire setup is repeated five times with different cross-validation folds and with different seeds for all sources of randomness. For the Adult and MNIST datasets the test set is fixed but training and validation splits are still rotated. |
| Hardware Specification | Yes | The experiments were run on a Linux machine equipped with an AMD Ryzen 7 7700X 8 core CPU and 32 GB of RAM. The comparisons with TVAE and TabDDPM additionally made use of a GeForce RTX 3060 GPU with 12 GB of VRAM. |
| Software Dependencies | No | Our implementation of the proposed tree-based methods is mostly Python code using the NumPy library (Harris et al., 2020) and Numba. We implement the tree evaluation and Gibbs sampling in C, making use of the PCG library (O'Neill, 2014) for random number generation. |
| Experiment Setup | Yes | We use random search to tune the hyperparameters of XGBoost and NGBoost and a grid search to tune the most important hyperparameters of each generative density model. We employ 5-fold cross-validation, repeating the hyperparameter tuning on each fold. For the full details of the experimental protocol please refer to Appendix D. ... Appendix D.3 Hyperparameter Tuning |
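The split protocol quoted under "Dataset Splits" (hold out 20% as a test set when no official split exists, then set aside 20% of the remainder for validation) can be sketched as follows. This is an illustrative reconstruction, not the authors' code; the function name and seed are hypothetical.

```python
import numpy as np

def make_splits(n_rows, seed=0):
    """Return (train, val, test) index arrays: 20% test, then 20% of the rest as validation."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_rows)
    n_test = int(0.2 * n_rows)        # 20% of the full dataset held out as test
    test, rest = idx[:n_test], idx[n_test:]
    n_val = int(0.2 * len(rest))      # 20% of the remaining data as validation
    val, train = rest[:n_val], rest[n_val:]
    return train, val, test

train, val, test = make_splits(1000)
# 1000 rows -> 640 train / 160 validation / 200 test
```

For Adult and MNIST, where the paper fixes the test set, only the train/validation step would be re-drawn across repetitions.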
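The tuning procedure quoted under "Experiment Setup" (random search over hyperparameters, scored by cross-validation, repeated per fold) can be sketched generically. The search space below is a placeholder; the paper's actual ranges are given in its Appendix D.3.

```python
import random

# Illustrative XGBoost-style search space (NOT the paper's actual ranges).
SPACE = {
    "max_depth": lambda: random.randint(3, 10),
    "learning_rate": lambda: 10 ** random.uniform(-3, 0),
    "subsample": lambda: random.uniform(0.5, 1.0),
}

def sample_config():
    """Draw one random hyperparameter configuration from SPACE."""
    return {name: draw() for name, draw in SPACE.items()}

def random_search(evaluate, n_trials=50, seed=0):
    """Keep the best of n_trials random configs.

    `evaluate` should return a score to maximize, e.g. a mean
    5-fold cross-validation score for the candidate config.
    """
    random.seed(seed)
    best_cfg, best_score = None, float("-inf")
    for _ in range(n_trials):
        cfg = sample_config()
        score = evaluate(cfg)
        if score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score
```

In the paper's protocol this whole search is re-run on each of the five cross-validation folds, so each fold gets its own tuned configuration.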