Globally-Consistent Rule-Based Summary-Explanations for Machine Learning Models: Application to Credit-Risk Evaluation
Authors: Cynthia Rudin, Yaron Shaposhnik
JMLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, we apply these algorithms to multiple credit-risk models trained on the Explainable Machine Learning Challenge data from FICO and demonstrate that our approach effectively produces sparse summary-explanations of these models in seconds. Our approach is model-agnostic (that is, can be used to explain any predictive model), and solves a minimum set cover problem to construct its summaries. We conducted a computational study to assess the quality of the summary-explanations generated by our algorithms on a real-world dataset provided by FICO. |
| Researcher Affiliation | Academia | Cynthia Rudin, Departments of Computer Science and Electrical & Computer Engineering, Duke University, Durham, NC, USA; Yaron Shaposhnik, Simon Business School, University of Rochester, Rochester, NY, USA |
| Pseudocode | Yes | Algorithm 1 (Cont Min Set Cover). Input: data matrix X, observation to explain x_e, global model predictions {h_global(x_i)}_{i=1}^N, and h_global(x_e). Output: globally-consistent rule h_{R, τ, h_global(x_e)}. Algorithm 2 (Cont Max Support). Input: γ (number of initial rules to extract), w_s and w_c (objective coefficients), X (data matrix), observation to explain x_e, and predictions {h_global(x_i)}_{i=1}^N, h_global(x_e). Output: globally-consistent rule h_{R, τ, h_global(x_e)}. |
| Open Source Code | Yes | Finally, we developed a simple programming interface in Python for applying our algorithms, which is publicly available at this link (see Appendix C for more information). A Python interface for our algorithms is publicly available online (link). Figure 25 illustrates a basic working example. |
| Open Datasets | Yes | We used the dataset for the Explainable Machine Learning Challenge (FICO 2018) for these examples. The dataset can be downloaded; see FICO (2018). We use the US 1990 Census dataset (Meek et al. 2022), whereby we predict whether an individual's annual income exceeds $15,000 based on demographic and financial information. |
| Dataset Splits | Yes | We trained the aforementioned global models on a random training set consisting of 75% of all observations. ... Models were then evaluated on the test data (the remaining 25%). We split the dataset into a training set and a test set, each consisting of 50K observations. |
| Hardware Specification | No | The total running time of the experiment was approximately two weeks on 3 personal computers, setting a maximal timeout of 1 minute for solving IPs. |
| Software Dependencies | Yes | The code was developed in Python v3.7 and makes use of the following libraries: pandas, matplotlib, numpy, scikit-learn, pulp (Mitchell et al. 2011), and Gurobi (Gurobi 2014). |
| Experiment Setup | Yes | To find summary-explanations on binary datasets, we first solve Bin Min Set Cover to find summary-explanations with optimal sparsity (denoted "Min Features"). Denoting by p_c^OPT the minimal number of features obtained by optimizing for the sparsity of the resulting rule, we then solve Bin Max Support 3 times, setting w_s = 1, w_c = 0, and p_c to p_c^OPT, p_c^OPT + 1, and p_c^OPT + 2. That is, we maximize support and gradually relax the restriction on the maximal number of features (i.e., sparsity) of the resulting rules. The total running time of the experiment was approximately two weeks on 3 personal computers, setting a maximal timeout of 1 minute for solving IPs. Using the training set, we train and evaluate 3 global models: LR (max_iter = 1000), RF (max_depth = 7, min_samples_split = 5), and ANN (max_iter = 1000, hidden_layer_sizes = (4, 2)). |
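The minimum-set-cover formulation quoted above can be illustrated in miniature: each observation whose global prediction differs from that of x_e must disagree with x_e on at least one selected feature, so selecting features is a set cover over the inconsistent observations. The sketch below is a hypothetical greedy approximation in pure Python (the paper solves the exact integer program with pulp/Gurobi; the function name and toy data are illustrative, not from the paper):

```python
def greedy_summary_explanation(X, y, e):
    """Greedy set-cover sketch for a globally-consistent summary-explanation.

    X: list of binary feature vectors; y: global model predictions;
    e: index of the observation to explain.
    Returns a set R of feature indices such that every observation that
    agrees with X[e] on all features in R receives the same prediction
    as X[e] (global consistency).
    """
    xe, ye = X[e], y[e]
    # Observations with a different prediction must each be "covered":
    # they must disagree with xe on at least one selected feature.
    to_cover = {i for i in range(len(X)) if y[i] != ye}
    R = set()
    while to_cover:
        # Greedy step: pick the feature that covers the most
        # still-uncovered inconsistent observations.
        best = max(range(len(xe)),
                   key=lambda j: sum(X[i][j] != xe[j] for i in to_cover))
        covered = {i for i in to_cover if X[i][best] != xe[best]}
        if not covered:
            break  # no single feature separates the rest
        R.add(best)
        to_cover -= covered
    return R

# Toy data: 3 binary features, global predictions for 4 observations.
X = [[1, 0, 1], [1, 1, 1], [0, 0, 1], [1, 0, 0]]
y = [1, 1, 0, 0]
R = greedy_summary_explanation(X, y, 0)  # explain observation 0
```

The greedy heuristic carries the standard logarithmic approximation guarantee for set cover, whereas the paper's IP yields provably minimal feature sets (optimal sparsity).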
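The second stage quoted in the experiment setup (Bin Max Support with a sparsity budget p_c) maximizes the number of observations a consistent rule applies to. For small feature counts this objective can be illustrated by brute-force enumeration over feature subsets; the sketch below is a hypothetical stand-in, not the paper's IP formulation with weights w_s and w_c:

```python
from itertools import combinations

def max_support_rule(X, y, e, p_c):
    """Brute-force sketch of the Max Support objective (illustrative only).

    Among globally-consistent rules that use at most p_c features of the
    observation X[e], return the feature set with the largest support
    (number of observations the rule applies to), plus that support.
    """
    xe, ye = X[e], y[e]
    best_R, best_support = None, -1
    for k in range(1, p_c + 1):
        for R in combinations(range(len(xe)), k):
            # Support: observations matching xe on every feature in R.
            match = [i for i in range(len(X))
                     if all(X[i][j] == xe[j] for j in R)]
            # Globally consistent: all matching observations share ye.
            if all(y[i] == ye for i in match) and len(match) > best_support:
                best_R, best_support = set(R), len(match)
    return best_R, best_support

# Same toy data as above; allow up to p_c = 2 features.
X = [[1, 0, 1], [1, 1, 1], [0, 0, 1], [1, 0, 0]]
y = [1, 1, 0, 0]
R, support = max_support_rule(X, y, 0, 2)
```

Enumeration is exponential in p_c, which is why the paper formulates this stage as an integer program and imposes the 1-minute IP timeout mentioned in the table.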