Feature Responsiveness Scores: Model-Agnostic Explanations for Recourse
Authors: Seung Hyun Cheon, Anneke Wernerfelt, Sorelle Friedler, Berk Ustun
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct an extensive empirical study on the responsiveness of explanations in lending. Our results show that standard practices in consumer finance can backfire by presenting consumers with reasons without recourse, and demonstrate how our approach improves consumer protection by highlighting responsive features and identifying fixed predictions. |
| Researcher Affiliation | Academia | Seung Hyun Cheon (UC San Diego); Anneke Wernerfelt (Haverford College); Sorelle A. Friedler (Haverford College); Berk Ustun (UC San Diego) |
| Pseudocode | Yes | Algorithm 1: Sample Reachable Points; Algorithm 2: Enumerate Reachable Points |
| Open Source Code | Yes | We include a Python library to compute feature responsiveness scores available on GitHub. |
| Open Datasets | Yes | We work with three publicly available consumer finance classification datasets. ... heloc: n = 5,842, d = 43, FICO [23] ... german: n = 1,000, d = 36, Dua & Graff [15] ... givemecredit: n = 120,268, d = 23, Kaggle [32] |
| Dataset Splits | Yes | We split each dataset into a training sample (80%; to train models and tune parameters) and a test sample (20%; to evaluate out-of-sample performance). |
| Hardware Specification | No | The paper does not explicitly describe the hardware used to run its experiments. |
| Software Dependencies | No | The paper mentions using a 'Python library' and various machine learning models (Logistic Regression, XGBoost, Random Forests, SHAP, LIME) but does not provide specific version numbers for any of these software components. |
| Experiment Setup | Yes | We fit models using (1) logistic regression (LR), (2) XGBoost (XGB), and (3) random forests (RF). For each model, we construct feature-highlighting explanations for each person who is denied credit that highlight up to four features... We chose the sample size N = 500 to ensure that the 100(1 − α)% confidence interval in Appendix A.2 had an upper bound ≤ 0.01 when μ̂_j(x) = 0 with α = 0.01. |
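
The sample-size statement in the Experiment Setup row can be sanity-checked with a short calculation. The sketch below is not the paper's procedure: Appendix A.2 is not reproduced here, so it assumes a one-sided exact (Clopper-Pearson) upper bound for a proportion estimated as zero from N samples, which has the closed form 1 − α^(1/N). The function names are hypothetical and chosen only for illustration.

```python
import math


def zero_success_upper_bound(n: int, alpha: float) -> float:
    """One-sided exact upper bound on a proportion when 0 of n samples succeed.

    Assumption: a Clopper-Pearson-style bound, not necessarily the
    interval used in the paper's Appendix A.2.
    """
    return 1.0 - alpha ** (1.0 / n)


def min_samples_for_bound(target: float, alpha: float) -> int:
    """Smallest n whose zero-success upper bound is at most `target`.

    Solves 1 - alpha**(1/n) <= target, i.e. n >= ln(alpha) / ln(1 - target).
    """
    return math.ceil(math.log(alpha) / math.log(1.0 - target))


# Values quoted in the table: N = 500 samples, alpha = 0.01, target bound 0.01.
bound = zero_success_upper_bound(500, 0.01)
n_min = min_samples_for_bound(0.01, 0.01)

print(f"upper bound at N=500: {bound:.5f}")
print(f"smallest N with bound <= 0.01: {n_min}")
```

Under this assumed bound, N = 500 leaves the upper limit slightly below 0.01, which is consistent with the quoted setup. A two-sided interval or a different concentration bound would give different numbers.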