On Integrating Logical Analysis of Data into Random Forests
Authors: David Ing, Said Jabbour, Lakhdar Saïs
IJCAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we conduct comparative experiments between our Random Forest using MSSes of LAD (RF-LAD) and the state-of-the-art Random Forest (RF) (from the scikit-learn Python library [Pedregosa et al., 2011]). The assessment is performed on 23 datasets, which are standard benchmarks originating from well-known repositories such as CP4IM (https://dtai-static.cs.kuleuven.be/CP4IM/datasets/), Kaggle (www.kaggle.com), OpenML (www.openml.org), and UCI (archive.ics.uci.edu/ml/). |
| Researcher Affiliation | Academia | David Ing, Said Jabbour, Lakhdar Saïs, CRIL, CNRS, Université d'Artois, France |
| Pseudocode | Yes | Algorithm 1: Classical Random Forest, Algorithm 2: Random Forest Based LAD (RF-LAD) |
| Open Source Code | No | To derive such explanations from those RFs, we utilized the recent Random Forest explanation tool (RFxpl: https://github.com/izzayacine/RFxpl) proposed by Izza and Marques-Silva [Izza and Marques-Silva, 2021]. This describes a third-party tool used by the authors, not the release of their own code for RF-LAD. |
| Open Datasets | Yes | The assessment is performed on 23 datasets, which are standard benchmarks originating from well-known repositories such as CP4IM (https://dtai-static.cs.kuleuven.be/CP4IM/datasets/), Kaggle (www.kaggle.com), OpenML (www.openml.org), and UCI (archive.ics.uci.edu/ml/). |
| Dataset Splits | Yes | For every benchmark, a Repeated Stratified 10-fold cross-validation with 3 repetitions has been performed to maintain the class distribution (i.e., to address imbalanced datasets). |
| Hardware Specification | Yes | Experiments are conducted on a computer equipped with an Intel(R) Core(TM) i9-10900 CPU @ 2.80GHz with 62 GiB of memory. |
| Software Dependencies | No | In this section, we conduct comparative experiments between our Random Forest using MSSes of LAD (RF-LAD) and the state-of-the-art Random Forest (RF) (from the scikit-learn Python library [Pedregosa et al., 2011]). To enumerate the MSSes, we use the multithreaded implementation provided by the pMMCS algorithm [Murakami and Uno, 2014], and set the number of threads to 20 to ensure diversification in terms of the generated MSSes. Specific version numbers for these software components are not provided. |
| Experiment Setup | Yes | The parameters of RFs are kept at their default values (i.e., in a forest, K = 100 DTs), except for the maximum depth, where we set different depths d ∈ {3, 4, 5}. To control the complexity, we modified pMMCS by limiting the number of MSSes to 100K (i.e., NS = 100K) for all the considered datasets. For a fair comparison, we randomly select 100 different MSSes (without redundancy) from the 100K generated MSSes, resulting in 100 different DTs to build our RF-LAD. |
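The baseline protocol quoted above (a scikit-learn RF with the default K = 100 trees, maximum depth swept over {3, 4, 5}, scored by Repeated Stratified 10-fold cross-validation with 3 repetitions) can be sketched as follows. This is an assumed reconstruction, not the authors' released code; the dataset is a stand-in, since the paper's 23 benchmarks are not bundled here.

```python
# Sketch of the paper's RF baseline protocol (assumed reconstruction).
# The breast-cancer dataset is a placeholder for the paper's 23 benchmarks.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score

X, y = load_breast_cancer(return_X_y=True)  # stand-in binary dataset

# Repeated Stratified 10-fold CV, 3 repetitions -> 30 scores per setting,
# with stratification preserving the class distribution in every fold.
cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=0)

for depth in (3, 4, 5):  # the depths d in {3, 4, 5} from the paper
    rf = RandomForestClassifier(n_estimators=100,  # default K = 100 DTs
                                max_depth=depth, random_state=0)
    scores = cross_val_score(rf, X, y, cv=cv, scoring="accuracy")
    print(f"depth={depth}: mean accuracy {scores.mean():.3f} "
          f"over {len(scores)} folds")
```

Stratification matters here because several of the benchmark datasets are imbalanced; plain k-fold could otherwise produce folds with skewed class ratios.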
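The "100 different MSSes (without redundancy) from the 100K generated MSSes" step amounts to sampling without replacement from the enumerated pool. A minimal sketch, assuming each MSS is represented as a frozenset of item indices (the real pool would come from the modified pMMCS enumerator, which is not available here):

```python
# Hedged sketch: drawing 100 distinct MSSes from a large enumerated pool.
# The pool below is synthetic; in the paper it is produced by pMMCS
# (capped at NS = 100K), with each MSS feeding one decision tree.
import random

random.seed(42)

# Synthetic stand-in pool of set-valued MSS candidates (deduplicated).
pool = list({frozenset(random.sample(range(50), k=5)) for _ in range(100_000)})

# random.sample draws without replacement, so the 100 picks are distinct.
selected = random.sample(pool, k=100)
print(f"selected {len(selected)} distinct MSSes from a pool of {len(pool)}")
```

Sampling without replacement is what guarantees the "without redundancy" condition, so each of the 100 resulting decision trees is built from a different MSS.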