Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Decision trees as partitioning machines to characterize their generalization properties
Authors: Jean-Samuel Leboeuf, Frédéric LeBlanc, Mario Marchand
NeurIPS 2020 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We benchmark our pruning algorithm on 19 datasets taken from the UCI Machine Learning Repository [Dua and Graff, 2017]. We compare our pruning algorithm to CART's cost-complexity algorithm... Table 1 presents the results of the four models we tested. ... Mean test accuracy and standard deviation on 25 random splits of 19 datasets... |
| Researcher Affiliation | Academia | Jean-Samuel Leboeuf Department of Computer Science and Software Engineering Université Laval, Québec, QC, Canada EMAIL Frédéric Le Blanc Department of Mathematics and Statistics Université de Moncton, Moncton, NB, Canada EMAIL Mario Marchand Department of Computer Science and Software Engineering Université Laval, Québec, QC, Canada EMAIL |
| Pseudocode | Yes | The formal version of the algorithm is presented in Algorithm 3 of Appendix E.1. |
| Open Source Code | Yes | The source code used in the experiments and to produce the tables is freely available at the address https://github.com/jsleb333/paper-decision-trees-as-partitioning-machines. |
| Open Datasets | Yes | We benchmark our pruning algorithm on 19 datasets taken from the UCI Machine Learning Repository [Dua and Graff, 2017]. |
| Dataset Splits | Yes | As such, we chose to randomly split each dataset so that the models are trained on 75% of the examples and tested on the remaining 25%. To limit the effect of the randomness of the splits, we run each experiment 25 times and we report the mean test accuracy and the standard deviation. |
| Hardware Specification | No | No specific hardware details (like GPU/CPU models, memory, or specific computing environments) used for running experiments were provided in the paper. It only states 'All experiments were done in pure Python.' |
| Software Dependencies | No | The paper mentions 'pure Python' and 'scikit-learn Python package' but does not provide specific version numbers for these or any other software dependencies, which are required for reproducibility. |
| Experiment Setup | Yes | The first model we consider is the greedily learned tree, grown using the Gini index until the tree has 100% classification accuracy on the training set or reaches 40 leaves. We impose this limit since the computation times for pruning trees become prohibitive for a large number of leaves. |
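The protocol quoted above (25 random 75/25 splits, trees grown greedily with the Gini index until perfect training accuracy or a 40-leaf cap) can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes scikit-learn's `DecisionTreeClassifier`, and substitutes the bundled breast-cancer dataset for one of the 19 UCI datasets.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer  # stand-in for a UCI dataset
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

accuracies = []
for seed in range(25):  # 25 random splits, as described in the paper
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.25, random_state=seed)  # 75% train / 25% test
    # Grow greedily with the Gini index; by default the tree stops when
    # training accuracy reaches 100%, and max_leaf_nodes caps it at 40 leaves.
    tree = DecisionTreeClassifier(criterion="gini", max_leaf_nodes=40,
                                  random_state=seed)
    tree.fit(X_tr, y_tr)
    accuracies.append(tree.score(X_te, y_te))

# Report mean test accuracy and standard deviation over the 25 splits.
print(f"mean test accuracy: {np.mean(accuracies):.3f} "
      f"+/- {np.std(accuracies):.3f}")
```

Note that this sketch covers only the greedily grown baseline; the paper's pruning algorithm itself is available in the linked repository.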