On Computing Probabilistic Explanations for Decision Trees

Authors: Marcelo Arenas, Pablo Barcelo, Alexander Kozachinskiy, Miguel Romero, Bernardo Subercaseaux

JAIR 2025

Reproducibility Variable Result LLM Response
Research Type Theoretical Our paper settles the computational complexity of δ-sufficient-reasons over decision trees, showing that both (1) finding δ-sufficient-reasons that are minimal in size, and (2) finding δ-sufficient-reasons that are minimal inclusion-wise, are computationally intractable. By doing this, we answer two open problems originally raised by Izza et al. (2021), and extend the hardness of explanations for Boolean circuits presented by Wäldchen et al. (2021) to the more restricted case of decision trees. Furthermore, we present sharp non-approximability results under a widely believed complexity hypothesis. On the positive side, we identify structural restrictions of decision trees that make the problem tractable.
Researcher Affiliation Academia MARCELO ARENAS, Department of Computer Science and Institute for Mathematical and Computational Engineering, Faculty of Engineering, Catholic University of Chile, Chile and IMFD, Chile PABLO BARCELÓ, Institute for Mathematical and Computational Engineering, Faculty of Engineering and Faculty of Mathematics, Catholic University of Chile, Chile, IMFD, Chile, and CENIA, Chile ALEXANDER KOZACHINSKIY, CENIA, Chile MIGUEL ROMERO, Department of Computer Science, Faculty of Engineering, Catholic University of Chile, Chile and CENIA, Chile BERNARDO SUBERCASEAUX, School of Computer Science, Carnegie Mellon University, USA
Pseudocode Yes Algorithm 2: Minimal Sufficient Reason
Input: Decision tree T and instance x, both of dimension n
Output: A minimal sufficient reason y for x under T
y ← x
for i ∈ {1, . . . , n} do
    ŷ ← y
    ŷ[i] ← ⊥
    if CheckSufficientReason(T, ŷ, x) then
        y ← ŷ
    end if
end for
return y
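The greedy procedure above can be sketched in Python. This is a hypothetical illustration, not the authors' implementation: the tree encoding (nested tuples), the names `classify`, `check_sufficient_reason`, and `minimal_sufficient_reason`, and the use of `None` for an unset feature (the algorithm's ⊥) are all assumptions. It realizes the δ = 1 case, where a sufficient reason must force the classification on every completion; the probabilistic δ-variant studied in the paper would instead count the fraction of agreeing completions.

```python
# Hypothetical encoding (not from the paper):
#   leaf node:     ("leaf", label)
#   internal node: ("node", feature_index, left_child, right_child)
# A partial instance uses None where the pseudocode writes the symbol ⊥.

def classify(tree, x):
    """Classify a complete binary instance x with the decision tree."""
    while tree[0] == "node":
        _, i, left, right = tree
        tree = left if x[i] == 0 else right
    return tree[1]

def check_sufficient_reason(tree, y, target):
    """True iff every completion of the partial instance y is
    classified as `target` (the delta = 1 notion of sufficiency)."""
    if tree[0] == "leaf":
        return tree[1] == target
    _, i, left, right = tree
    if y[i] is None:
        # Feature i is unset, so both branches are reachable
        # and both must lead only to `target` leaves.
        return (check_sufficient_reason(left, y, target)
                and check_sufficient_reason(right, y, target))
    return check_sufficient_reason(left if y[i] == 0 else right, y, target)

def minimal_sufficient_reason(tree, x):
    """Greedy loop from Algorithm 2: try to drop each feature once;
    keep the drop only if the remaining features stay sufficient.
    Returns an inclusion-wise minimal sufficient reason for x."""
    target = classify(tree, x)
    y = list(x)
    for i in range(len(x)):
        saved, y[i] = y[i], None      # tentatively set y[i] to ⊥
        if not check_sufficient_reason(tree, y, target):
            y[i] = saved              # feature i is needed; restore it
    return y
```

For example, on a tree that returns 1 only when both features are 1, the instance (0, 0) admits the reason [None, 0]: fixing the second feature to 0 already forces class 0, so the first feature can be dropped.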
Open Source Code No The paper does not contain any explicit statement about releasing source code or a link to a code repository for the methodology described.
Open Datasets No The paper provides examples of decision trees (e.g., Figures 1, 2, 3) to illustrate concepts, but these are small, illustrative examples and not publicly available datasets used for empirical evaluation. There is no mention of specific datasets, links, DOIs, or citations for external datasets used in experiments.
Dataset Splits No The paper focuses on theoretical computational complexity, hardness results, and tractability results for algorithms. It does not present empirical studies with datasets; therefore, there is no information about dataset splits.
Hardware Specification No The paper focuses on theoretical computational complexity, hardness results, and tractability results for algorithms. It does not describe any experimental setup or report empirical results that would require specific hardware specifications.
Software Dependencies No The paper discusses theoretical algorithms and their computational complexity. It does not describe any experimental implementation or specific software dependencies with version numbers required for replication.
Experiment Setup No The paper is theoretical, focusing on computational complexity and proofs for probabilistic explanations in decision trees. It does not include an experimental section detailing hyperparameters, training configurations, or system-level settings.