Efficient Learning of Interpretable Classification Rules
Authors: Bishwamittra Ghosh, Dmitry Malioutov, Kuldeep S. Meel
JAIR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In our experiments, IMLI achieves the best balance among prediction accuracy, interpretability, and scalability. For instance, IMLI attains a competitive prediction accuracy and interpretability w.r.t. existing interpretable classifiers and demonstrates impressive scalability on large datasets where both interpretable and non-interpretable classifiers fail. As an application, we deploy IMLI in learning popular interpretable classifiers such as decision lists and decision sets. |
| Researcher Affiliation | Collaboration | Bishwamittra Ghosh (National University of Singapore), Dmitry Malioutov, Kuldeep S. Meel (National University of Singapore) |
| Pseudocode | Yes | Algorithm 1: MaxSAT-based Mini-batch Learning; Algorithm 2: Iterative CNF Classifier Learning; Algorithm 3: Iterative learning of decision lists; Algorithm 4: Iterative learning of decision sets |
| Open Source Code | Yes | The source code is available at https://github.com/meelgroup/mlic. |
| Open Datasets | Yes | We experiment with real-world binary classification datasets from UCI (Dua & Graff, 2017), OpenML (Vanschoren et al., 2013), and the Kaggle repository (https://www.kaggle.com/datasets), as listed in Table 1. |
| Dataset Splits | Yes | We perform ten-fold cross-validation on each dataset and evaluate the performance of different classifiers based on the median prediction accuracy on the test data. |
| Hardware Specification | Yes | We conduct each experiment on an Intel Xeon E7-8857 v2 CPU using a single core with 16 GB of RAM running on a 64-bit Linux distribution based on Debian. For all classifiers, we set the training timeout to 1000 seconds. |
| Software Dependencies | No | To implement IMLI, we deploy a state-of-the-art MaxSAT solver, Open-WBO (Martins et al., 2014), which returns the current best solution upon reaching a timeout. We compare IMLI with state-of-the-art interpretable and non-interpretable classifiers. Among interpretable classifiers, we compare with RIPPER (Cohen, 1995), BRL (Letham et al., 2015), CORELS (Angelino et al., 2017), and BRS (Wang et al., 2017). Among non-interpretable classifiers, we compare with Random Forest (RF), Support Vector Machine with linear kernels (SVM), Logistic Regression classifier (LR), and k-Nearest Neighbors classifier (kNN). We deploy the Scikit-learn library in Python for implementing non-interpretable classifiers. |
| Experiment Setup | Yes | For IMLI, we vary the number of clauses k ∈ {1, 2, …, 5} and the regularization parameter λ in a logarithmic grid by choosing 5 values between 10⁻⁴ and 10¹. In the mini-batch learning in IMLI, we set the number of samples in each mini-batch, n ∈ {50, 100, 200, 400}. |
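The hyperparameter grid in the experiment-setup row can be enumerated directly. The sketch below is illustrative, not the paper's code: the variable names (`ks`, `lambdas`, `batch_sizes`) are assumptions, and only the value ranges come from the quoted text.

```python
import itertools

import numpy as np

# Grid as described in the paper: number of clauses k in {1, ..., 5},
# regularization lambda as 5 log-spaced values in [1e-4, 1e1],
# and mini-batch size n in {50, 100, 200, 400}.
ks = range(1, 6)
lambdas = np.logspace(-4, 1, num=5)
batch_sizes = [50, 100, 200, 400]

# Every (k, lambda, n) configuration to be cross-validated.
grid = list(itertools.product(ks, lambdas, batch_sizes))
# 5 * 5 * 4 = 100 configurations in total.
```

`np.logspace(-4, 1, num=5)` yields values spaced evenly on a log scale from 10⁻⁴ up to 10¹, matching the "5 values between 10⁻⁴ and 10¹" in the quoted setup.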
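The dataset-splits row states that performance is the median prediction accuracy over ten-fold cross-validation. A minimal sketch of that protocol, assuming a generic `fit_predict` interface and a hypothetical majority-vote classifier as a stand-in (neither is from the paper's code):

```python
import numpy as np


def ten_fold_median_accuracy(X, y, fit_predict, seed=0):
    """Ten-fold cross-validation; report the MEDIAN test accuracy,
    as in the paper's evaluation protocol."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    folds = np.array_split(idx, 10)
    accs = []
    for i in range(10):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(10) if j != i])
        y_pred = fit_predict(X[train], y[train], X[test])
        accs.append(np.mean(y_pred == y[test]))
    return float(np.median(accs))


# Hypothetical stand-in classifier: always predict the majority training label.
def majority_classifier(X_tr, y_tr, X_te):
    maj = int(np.mean(y_tr) >= 0.5)
    return np.full(len(X_te), maj)


# Synthetic binary-classification data for illustration only.
X = np.random.default_rng(1).normal(size=(200, 4))
y = (X[:, 0] > 0).astype(int)
acc = ten_fold_median_accuracy(X, y, majority_classifier)
```

Any of the compared classifiers (IMLI, RIPPER, RF, SVM, ...) could be plugged in via `fit_predict`; reporting the median rather than the mean makes the summary robust to the occasional fold where a solver hits its timeout.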