Witty: An Efficient Solver for Computing Minimum-Size Decision Trees
Authors: Luca Pascal Staus, Christian Komusiewicz, Frank Sommer, Manuel Sorge
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this work, we test this paradigm empirically. We provide an implementation, augment it with extensive heuristic improvements, and scrutinize it on standard benchmark instances. The augmentations achieve a mean 324-fold (median 84-fold) speedup over the naive implementation. Compared to the state of the art they achieve a mean 32-fold (median 7-fold) speedup over the dynamic programming based MurTree solver and a mean 61-fold (median 25-fold) speedup over SAT-based implementations. |
| Researcher Affiliation | Academia | 1Institute of Computer Science, Friedrich Schiller University Jena, Germany 2Institute of Logic and Computation, TU Wien, Austria EMAIL, EMAIL, EMAIL, EMAIL |
| Pseudocode | Yes | Algorithm 1: Base witness-tree algorithm. Input: A witness tree W, a data set (E, λ), and a maximum size s ∈ ℕ. Output: A perfect witness tree of size at most s, or ⊥ if none could be found. 1 Function Refine(W, (E, λ), s): 2 if W classifies (E, λ) then return W; 3 if W has size s then return ⊥; 4 e ← some dirty example from Dirty(W); 5 forall r = (v, i, t, e) ∈ Ref(W) do 6 Apply r to W to create a new witness tree R; 7 R′ ← Refine(R, (E, λ), s); 8 if R′ ≠ ⊥ then return R′ |
| Open Source Code | Yes | Code, Data and Experimental Results https://doi.org/10.5281/zenodo.11235017 |
| Open Datasets | Yes | We tested the solvers on standard benchmark data sets from the Penn Machine Learning Benchmarks (Romano et al. 2022). |
| Dataset Splits | Yes | Following Janota and Morgado (2020), we randomly sampled multiple subsets of the examples from each data set. Specifically, for each data set we chose 10 random subsets with 20% of the examples and 10 random subsets with 50% of the examples. |
| Hardware Specification | Yes | Our experiments were performed on servers with 24 GB RAM and two Intel(R) Xeon(R) E5540 CPUs with 2.53 GHz, 4 cores, 8 threads, running Java openjdk 11.0.19. Each individual experiment was allowed to use up to 12 GB RAM. All algorithms were executed on a single core. |
| Software Dependencies | Yes | running Java openjdk 11.0.19. We implemented our algorithm in Kotlin (see Zenodo). To solve the LP-relaxation of the Pair LB we used Gurobi 10.0.3 (Gurobi Optimization, LLC 2023). |
| Experiment Setup | No | No specific hyperparameters (e.g., learning rate, batch size, number of epochs, optimizer settings) for the algorithm or training process are explicitly provided in the main text. The paper describes the experimental environment and data splitting, but does not give configuration parameters for running the proposed algorithm beyond its structural components. |
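The recursive refinement loop quoted in the Pseudocode row can be illustrated with a toy sketch: grow a decision tree one split at a time, backtracking once the size budget s is exhausted. This is our own simplified stand-in (names `find_tree`, `classify`, and the brute-force leaf-expansion step are hypothetical), not the Witty solver's actual witness-tree machinery.

```python
# Toy sketch of Algorithm 1's search pattern: refine a tree until it classifies
# every example, returning None (standing in for the paper's "⊥") on failure.
# Trees are ("leaf", label) or ("node", feature, left_subtree, right_subtree).

def classify(tree, x):
    """Route example x (a tuple of 0/1 feature values) through the tree."""
    if tree[0] == "leaf":
        return tree[1]
    _, f, lo, hi = tree
    return classify(hi if x[f] else lo, x)

def find_tree(examples, s, n_features):
    """Return a tree with at most s inner nodes fitting all examples, else None."""
    def fits(tree):
        return all(classify(tree, x) == y for x, y in examples)

    def expansions(tree):
        # Refinement step: replace some leaf by a single split (adds 1 inner node).
        if tree[0] == "leaf":
            for f in range(n_features):
                for l0 in (0, 1):
                    for l1 in (0, 1):
                        yield ("node", f, ("leaf", l0), ("leaf", l1))
        else:
            _, f, lo, hi = tree
            for sub in expansions(lo):
                yield ("node", f, sub, hi)
            for sub in expansions(hi):
                yield ("node", f, lo, sub)

    def refine(tree, budget):
        if fits(tree):              # line 2: W classifies (E, λ)
            return tree
        if budget == 0:             # line 3: size budget exhausted
            return None
        for new in expansions(tree):   # lines 5-6: apply each refinement
            result = refine(new, budget - 1)
            if result is not None:     # lines 7-8: propagate a success
                return result
        return None

    for label in (0, 1):            # try both trivial single-leaf start trees
        t = refine(("leaf", label), s)
        if t is not None:
            return t
    return None
```

For example, XOR on two binary features needs three inner nodes, so `find_tree(xor_examples, 2, 2)` exhausts the search and returns `None`, while a budget of 3 yields a perfect tree. The real solver prunes this search far more aggressively (dirty-example selection, lower bounds, caching), which is where the reported speedups come from.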
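The sampling scheme quoted in the Dataset Splits row (10 random subsets at 20% of the examples and 10 at 50%, per data set) can be reproduced with a short sketch. The function name `sample_subsets` and the fixed seed are our own assumptions for illustration; the paper does not specify its random-number handling.

```python
import random

def sample_subsets(n_examples, seed=0):
    """Draw 10 random index subsets at 20% and 10 at 50% of n_examples,
    mirroring the splitting scheme described in the paper (assumed details:
    sampling without replacement, rounding to the nearest integer)."""
    rng = random.Random(seed)
    subsets = []
    for frac in (0.2, 0.5):
        k = round(n_examples * frac)
        for _ in range(10):
            subsets.append(sorted(rng.sample(range(n_examples), k)))
    return subsets
```

For a data set with 100 examples this produces 20 subsets: ten of size 20 followed by ten of size 50, each a duplicate-free list of example indices.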