Efficient Learning of Interpretable Classification Rules

Authors: Bishwamittra Ghosh, Dmitry Malioutov, Kuldeep S. Meel

JAIR 2022

Reproducibility Variable Result LLM Response
Research Type Experimental In our experiments, IMLI achieves the best balance among prediction accuracy, interpretability, and scalability. For instance, IMLI attains a competitive prediction accuracy and interpretability w.r.t. existing interpretable classifiers and demonstrates impressive scalability on large datasets where both interpretable and non-interpretable classifiers fail. As an application, we deploy IMLI in learning popular interpretable classifiers such as decision lists and decision sets.
Researcher Affiliation Collaboration Bishwamittra Ghosh (EMAIL, National University of Singapore), Dmitry Malioutov (EMAIL), Kuldeep S. Meel (EMAIL, National University of Singapore)
Pseudocode Yes Algorithm 1: MaxSAT-based Mini-batch Learning...; Algorithm 2: Iterative CNF Classifier Learning...; Algorithm 3: Iterative learning of decision lists...; Algorithm 4: Iterative learning of decision sets
Open Source Code Yes The source code is available at https://github.com/meelgroup/mlic.
Open Datasets Yes We experiment with real-world binary classification datasets from UCI (Dua & Graff, 2017), OpenML (Vanschoren et al., 2013), and the Kaggle repository (https://www.kaggle.com/datasets), as listed in Table 1.
Dataset Splits Yes We perform ten-fold cross-validation on each dataset and evaluate the performance of different classifiers based on the median prediction accuracy on the test data.
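The evaluation protocol above (ten-fold cross-validation, reporting median test accuracy) can be sketched as follows. This is a minimal illustration, not the paper's code: it uses Scikit-learn (which the paper deploys only for the non-interpretable baselines) and a synthetic dataset as a stand-in for the UCI/OpenML/Kaggle data; IMLI itself is not invoked here.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic binary classification data as a stand-in for a real dataset.
X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# Ten-fold cross-validation; the paper reports the *median* accuracy
# over the ten held-out test folds, rather than the mean.
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=10)
median_accuracy = float(np.median(scores))
```

Reporting the median rather than the mean makes the summary statistic less sensitive to a single unlucky fold.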
Hardware Specification Yes We conduct each experiment on an Intel Xeon E7-8857 v2 CPU using a single core with 16 GB of RAM, running a 64-bit Linux distribution based on Debian. For all classifiers, we set the training timeout to 1000 seconds.
Software Dependencies No To implement IMLI, we deploy a state-of-the-art MaxSAT solver, Open-WBO (Martins et al., 2014), which returns the current best solution upon reaching a timeout. We compare IMLI with state-of-the-art interpretable and non-interpretable classifiers. Among interpretable classifiers, we compare with RIPPER (Cohen, 1995), BRL (Letham et al., 2015), CORELS (Angelino et al., 2017), and BRS (Wang et al., 2017). Among non-interpretable classifiers, we compare with Random Forest (RF), Support Vector Machine with linear kernels (SVM), Logistic Regression classifier (LR), and k-Nearest Neighbors classifier (k-NN). We deploy the Scikit-learn library in Python for implementing non-interpretable classifiers.
Experiment Setup Yes For IMLI, we vary the number of clauses k ∈ {1, 2, ..., 5} and the regularization parameter λ on a logarithmic grid by choosing 5 values between 10^-4 and 10^1. In the mini-batch learning in IMLI, we set the number of samples in each mini-batch to n ∈ {50, 100, 200, 400}.
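The reported hyperparameter sweep can be enumerated as below. This is a hypothetical sketch of the grid only (variable names are ours, and whether the paper crosses all three parameters exhaustively is an assumption); it does not run IMLI.

```python
import itertools
import numpy as np

# Grid mirroring the reported sweep:
#   number of clauses k in {1, ..., 5},
#   regularization lambda: 5 values on a log grid between 10^-4 and 10^1,
#   mini-batch size n in {50, 100, 200, 400}.
k_values = [1, 2, 3, 4, 5]
lambda_values = np.logspace(-4, 1, num=5)  # 1e-4 ... 1e1
batch_sizes = [50, 100, 200, 400]

# Full cross product of the three parameter lists.
grid = list(itertools.product(k_values, lambda_values, batch_sizes))
print(len(grid))  # 5 * 5 * 4 = 100 configurations
```

Each configuration would then be scored with the ten-fold cross-validation protocol described under Dataset Splits.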