Cultivating Archipelago of Forests: Evolving Robust Decision Trees Through Island Coevolution
Authors: Adam Zychowski, Andrew Perrault, Jacek Mańdziuk
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | ICoEvoRDF is evaluated on 20 benchmark datasets, demonstrating its superior performance compared to state-of-the-art methods in optimizing both adversarial accuracy and minimax regret. The flexibility of ICoEvoRDF allows for the integration of decision trees from various existing methods, providing a unified framework for combining diverse solutions. Our approach offers a promising direction for developing robust and interpretable machine learning models. |
| Researcher Affiliation | Academia | 1) Faculty of Mathematics and Information Science, Warsaw University of Technology; 2) Department of Computer Science and Engineering, The Ohio State University; 3) Faculty of Computer Science, AGH University of Krakow; 4) Center of Excellence in Artificial Intelligence, AGH University of Krakow. EMAIL, EMAIL, EMAIL |
| Pseudocode | No | A pseudocode of the proposed algorithm is presented in the supplementary material (Zychowski, Perrault, and Mańdziuk 2025). |
| Open Source Code | Yes | The ICoEvoRDF source code is available at github.com/zychowskia/ICoEvoRDF. |
| Open Datasets | Yes | The proposed method was evaluated on 20 widely used classification benchmark problems with varying characteristics, including the number of instances (from 351 to 70000), features (4–3072), and perturbation coefficients ε (0.01–0.3). All selected datasets have been utilized in prior studies referenced in the Related Work section and are publicly available at https://www.openml.org. |
| Dataset Splits | No | The proposed method was evaluated on 20 widely used classification benchmark problems with varying characteristics... For input data selection, we propose to assign a unique training set to each island. It mirrors the approach of classic random forests and is achieved by sampling with replacement from X until each island's training set contains |X| instances. |
| Hardware Specification | Yes | All tests were executed on an Intel Xeon Silver 4116 processor operating at 2.10GHz. |
| Software Dependencies | No | We employed the Nashpy Python library (Knight and Campbell 2018) implementation of the Lemke-Howson algorithm for computing the Mixed Nash Equilibrium. |
| Experiment Setup | Yes | Parameterization. For the experiments conducted in this study, we employed the same set of parameters across all islands and their values followed recommendations from the CoEvoRDT authors. Namely, decision tree population size NT = 200, perturbation population size NP = 500, number of consecutive generations for each population np = 20, number of best individuals from the DT population involved in the perturbations evaluation Ntop = 20, crossover probability pc = 0.8, mutation probability pm = 0.5, selection pressure ps = 0.9, elite size e = 2, HoF size NHoF = 200. Generations without improvement limit lc was set to 100 and generation limit lg = 1000. Preliminary evaluations were conducted to assess various migration topologies, such as ring, star, and circle configurations. The ring topology (depicted in Figure 1) yielded the most favorable results and was thus used for all experiments presented in the subsequent section. A more comprehensive discussion and detailed results for different topologies can be found in the supplementary material (Zychowski, Perrault, and Mańdziuk 2025). The number of generations between migrations, ng, was set to 40. The number of islands, |I|, was fixed at 10 to maintain comparable computational time to the state-of-the-art (SOTA) method, PRAdaBoost. |
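The per-island data assignment described in the Dataset Splits row — sampling with replacement from X until each island's training set holds |X| instances — can be sketched as follows. This is a minimal illustration; `island_training_sets` and its parameters are invented names, not code from the paper.

```python
import random

def island_training_sets(X, n_islands=10, seed=0):
    """Draw one bootstrap sample of size |X| per island.

    Each island's training set is built by sampling with replacement
    from X until it contains len(X) instances, mirroring the bootstrap
    used by classic random forests.
    """
    rng = random.Random(seed)
    return [
        [X[rng.randrange(len(X))] for _ in range(len(X))]
        for _ in range(n_islands)
    ]

# Example: 10 islands over a toy dataset of 100 instances.
islands = island_training_sets(list(range(100)), n_islands=10)
```

Because sampling is with replacement, islands see overlapping but distinct multisets of instances, which is what injects diversity across the archipelago.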
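The Software Dependencies row names the Nashpy implementation of the Lemke-Howson algorithm for computing a mixed Nash equilibrium between trees and perturbations. For intuition, here is a self-contained sketch (not the paper's code) of the 2x2 zero-sum special case, which admits a closed-form mixed equilibrium when neither player has a dominant pure strategy; the payoff values below are illustrative only.

```python
def mixed_equilibrium_2x2(a11, a12, a21, a22):
    """Closed-form mixed Nash equilibrium of a 2x2 zero-sum game.

    The row player maximizes and the column player minimizes the payoff
    matrix A = [[a11, a12], [a21, a22]]. Assumes an interior equilibrium
    (no saddle point), so the denominator is nonzero. Returns (p, q):
    the probability of playing row 1 and column 1, respectively.
    Nashpy computes the same object for general bimatrix games via
    nash.Game(A).lemke_howson(initial_dropped_label=0).
    """
    denom = a11 - a12 - a21 + a22
    p = (a22 - a21) / denom  # makes the column player indifferent
    q = (a22 - a12) / denom  # makes the row player indifferent
    return p, q

# Illustrative payoffs (not from the paper):
p, q = mixed_equilibrium_2x2(3.0, 0.0, 1.0, 2.0)  # → (0.25, 0.5)
```

At (p, q) = (0.25, 0.5), both of each player's pure strategies yield the same expected payoff (1.5), which is exactly the indifference condition defining a mixed equilibrium.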
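The Experiment Setup row reports that a ring migration topology, applied every ng = 40 generations across |I| = 10 islands, worked best. A minimal sketch of ring migration follows; representing individuals as bare fitness scores and the function name `migrate_ring` are simplifying assumptions, not details from the paper.

```python
def migrate_ring(populations, n_migrants=2):
    """Ring-topology migration sketch: each island copies its best
    individuals (here, plain fitness scores) to the next island in the
    ring. Migrants are selected before any island is modified, so an
    individual travels at most one hop per migration event.
    """
    k = len(populations)
    migrants = [sorted(pop, reverse=True)[:n_migrants] for pop in populations]
    for i, m in enumerate(migrants):
        populations[(i + 1) % k].extend(m)
    return populations

# Three toy islands whose individuals are fitness values.
islands = [[0.9, 0.1], [0.5, 0.4], [0.8, 0.2]]
migrate_ring(islands, n_migrants=1)
# Island 1 now also holds 0.9, the best individual of island 0.
```

Selecting all migrants up front is the design choice that distinguishes a ring from uncontrolled diffusion: good solutions spread gradually around the archipelago instead of flooding every island at once.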