Most General Explanations of Tree Ensembles

Authors: Yacine Izza, Alexey Ignatiev, Sasha Rubin, Joao Marques-Silva, Peter J. Stuckey

IJCAI 2025

Reproducibility Variable Result LLM Response
Research Type — Experimental. This section presents a summary of the empirical assessment of computing maximum inflated abductive explanations for tree ensembles, in a case study of RFmv and BT models trained on several widely studied datasets. The evaluation investigates the following research questions:
RQ1: Are the hypercubes representing Max-iAXp explanations much larger than iAXps on real-world benchmarks?
RQ2: Do the proposed logical encodings scale to practical RFs and BTs?
RQ3: Does the algorithm converge quickly to deliver the optimal explanation?
Researcher Affiliation — Academia. Yacine Izza (1), Alexey Ignatiev (2), Sasha Rubin (3), Joao Marques-Silva (4), Peter J. Stuckey (2,5). (1) CREATE, National University of Singapore, Singapore; (2) Monash University, Melbourne, Australia; (3) University of Sydney, Australia; (4) ICREA, University of Lleida, Spain; (5) OPTIMA ARC Industrial Training and Transformation Centre, Melbourne, Australia. EMAIL, EMAIL, EMAIL, EMAIL, EMAIL. All listed affiliations are academic institutions or research centers.
Pseudocode — Yes.
Algorithm 1: Computing maximum inflated AXp for TE
Input: Expl. prob. E = (M, (v, c)); WCNF H, S
Output: One Max-iAXp (X, E)
 1: repeat
 2:   (µ, s) ← MaxSAT(H, S)
 3:   E ← {⋃_{l≤j≤u} I_i^j | (y_i^{l,u} ∈ S) ∧ µ(y_i^{l,u}) = 1}
 4:   X ← {i ∈ F | |E_i| < |D_i|}
 5:   hasCEx ← WiAXp(E; X, E)
 6:   if hasCEx = true then
 7:     (Y, G) ← FindiCXp(E; F, E)
 8:     H ← H ∪ {BlockCl(Y, G)}
 9: until hasCEx = false
10: return (X, E)
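The control flow of Algorithm 1 can be sketched in Python. The five oracle routines (MaxSAT solving, model decoding, the weak-iAXp counterexample check, iCXp extraction, and blocking-clause construction) are passed in as callables; their names and signatures here are hypothetical stand-ins, since the actual implementations live in the RFxpl/XReason packages.

```python
def max_iaxp(H, S, maxsat, decode, wi_axp, find_icxp, block_cl):
    """Control-flow sketch of Algorithm 1 (Max-iAXp for tree ensembles).

    H, S are the hard/soft parts of the WCNF. The five oracle callables
    (maxsat, decode, wi_axp, find_icxp, block_cl) are hypothetical
    stand-ins for the paper's MaxSAT, model-decoding, weak-iAXp check,
    iCXp extraction and blocking-clause routines.
    """
    while True:
        mu, s = maxsat(H, S)        # line 2: optimal MaxSAT model
        X, E = decode(mu)           # lines 3-4: intervals E_i, feature set X
        has_cex = wi_axp(X, E)      # line 5: does a counterexample exist?
        if not has_cex:             # line 9: loop until no counterexample
            return X, E             # line 10: (X, E) is a Max-iAXp
        Y, G = find_icxp(X, E)      # line 7: extract a counterexample (iCXp)
        H = H + [block_cl(Y, G)]    # line 8: block it with a new hard clause
```

The loop terminates because each iteration adds a blocking clause to H, shrinking the space of MaxSAT candidates until the weak-iAXp check finds no counterexample.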
Open Source Code — Yes. The proposed approach is implemented in the RFxpl [1] and XReason [2] Python packages. [1] https://github.com/izzayacine/RFxpl [2] https://github.com/alexeyignatiev/xreason
Open Datasets — Yes. The assessment of TEs (RFs and BTs) is performed on a selection of publicly available datasets, 12 in total, originating from the UCI ML Repository [UCI, 2020] and the Penn ML Benchmarks [Olson et al., 2017].
Dataset Splits — No. For each dataset, we randomly pick 25 instances to test. This describes the selection of instances for explanation generation but does not provide specific training/validation/test splits for the models themselves.
Hardware Specification — Yes. The experiments are conducted on an Intel Core i5-10500 3.1 GHz CPU with 16 GB RAM running Ubuntu 22.04 LTS.
Software Dependencies — Yes. The PySAT toolkit [Ignatiev et al., 2018; Ignatiev et al., 2024] is used to instrument SAT and/or MaxSAT oracle calls. The RC2 MaxSAT solver [Ignatiev et al., 2019a], which is implemented in PySAT, is applied to all MaxSAT encodings (for the TE operation and hitting-set dualization). Moreover, Gurobi [Gurobi Optimization, LLC, 2023] is applied to instrument MIP oracle calls using its Python interface.
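The RC2 calls in this pipeline solve weighted partial MaxSAT problems: every hard clause must be satisfied while the total weight of falsified soft clauses is minimized. As an illustration of those semantics only (this is not the paper's encoding), a brute-force solver over a handful of variables makes the objective concrete:

```python
from itertools import product

def brute_force_maxsat(n_vars, hard, soft):
    """Tiny weighted partial MaxSAT by exhaustive search (illustration only).

    Clauses are lists of ints: positive literal = variable true, negative
    = false. soft is a list of (clause, weight) pairs. Returns
    (best_model, best_cost), where cost is the total weight of falsified
    soft clauses, or None if the hard clauses are unsatisfiable.
    """
    def sat(clause, model):
        return any(model[abs(l) - 1] == (l > 0) for l in clause)

    best = None
    for model in product([False, True], repeat=n_vars):
        if not all(sat(c, model) for c in hard):
            continue  # hard clauses are mandatory
        cost = sum(w for c, w in soft if not sat(c, model))
        if best is None or cost < best[1]:
            best = (model, cost)
    return best
```

Production solvers such as RC2 achieve the same result with core-guided search rather than enumeration; through PySAT the analogous call is roughly `RC2(wcnf).compute()` on a `pysat.formula.WCNF` object.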
Experiment Setup — No. The paper states: "When training RFs, we used Scikit-learn [Pedregosa et al., 2011] and for BTs we applied XGBoost [Chen and Guestrin, 2016]." However, it does not specify any hyperparameters (e.g., number of trees, max depth, learning rate) or other detailed training configurations used for these models.
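Since the hyperparameters are not reported, anyone attempting to reproduce the setup must choose their own. A typical Scikit-learn RF configuration might look like the following; every parameter value here is an assumption for illustration, not a setting taken from the paper, and the synthetic dataset is a stand-in for the real benchmarks.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in data; the paper uses UCI / Penn ML benchmark datasets.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# All hyperparameter values below are illustrative assumptions;
# the paper does not report the settings actually used.
rf = RandomForestClassifier(
    n_estimators=100,  # number of trees (assumed)
    max_depth=6,       # depth bound per tree (assumed)
    random_state=0,    # for reproducibility
)
rf.fit(X, y)
```

An analogous BT setup would pass `n_estimators`, `max_depth`, and `learning_rate` to `xgboost.XGBClassifier`; again, none of these values are reported in the paper.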