The Computational Complexity of Understanding Binary Classifier Decisions

Authors: Stephan Waeldchen, Jan Macdonald, Sascha Hauch, Gitta Kutyniok

JAIR 2021 | Venue PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Theoretical We show that the problem of deciding whether such subsets of relevant variables of limited size k d exist is complete for the complexity class NPPP and thus, generally, unfeasible to solve. We then introduce a variant, in which it suffices to check whether a subset determines the function value with probability at least δ or at most δ γ for 0 < γ < δ. This promise of a probability gap reduces the complexity to the class NPBPP. Finally, we show that finding the minimal set of relevant variables cannot be reasonably approximated, i.e. with an approximation factor d1 α for α > 0, by a polynomial time algorithm unless P = NP.
Researcher Affiliation Academia Stephan Wäldchen EMAIL Jan Macdonald EMAIL Sascha Hauch EMAIL Institut für Mathematik, Technische Universität Berlin, Berlin, Germany Gitta Kutyniok EMAIL Mathematisches Institut, Ludwig-Maximilians-Universität München, München, Germany
Pseudocode No The paper describes procedures as part of proofs and constructions (e.g., in Appendix C, 'We proceed according to the following iterative procedure: Start with the constant function Π0 0...'), but these are presented in paragraph form and mathematical notation rather than as clearly labeled pseudocode blocks or algorithms.
Open Source Code No The paper does not provide concrete access to source code for the methodology described. It mentions a companion paper where an algorithm is presented: 'In a companion paper (Macdonald et al., 2019) we present a heuristic algorithm for a continuous (non-discrete) variant of the δ-Relevant-Input problem and classifier functions with compositional (layered) structure (such as neural networks).'
Open Datasets No The paper is theoretical and defines computational problems related to Boolean functions and circuits; it does not present empirical results using specific datasets. While examples like 'image classification' are used to motivate the problem, no actual datasets are utilized or provided access for within this paper.
Dataset Splits No The paper is theoretical and does not conduct experiments with datasets, therefore no dataset split information is provided.
Hardware Specification No The paper presents theoretical work on computational complexity and does not describe any experiments that would require specific hardware. Therefore, no hardware specifications are provided.
Software Dependencies No The paper is theoretical and does not detail any experimental implementation. Therefore, no specific ancillary software details or version numbers are provided.
Experiment Setup No The paper focuses on theoretical computational complexity analysis and does not include any experimental results, thus no specific experimental setup details, hyperparameters, or training configurations are provided.