LIMIS: Locally Interpretable Modeling using Instance-wise Subsampling
Authors: Jinsung Yoon, Sercan Ö. Arik, Tomas Pfister
TMLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show on multiple tabular datasets that LIMIS near-matches the prediction accuracy of black-box models, significantly outperforming state-of-the-art locally interpretable models in terms of fidelity and prediction accuracy. We next study LIMIS on 3 real-world regression datasets: (1) Blog Feedback, (2) Facebook Comment, (3) News Popularity; and 2 real-world classification datasets: (4) Adult Income, (5) Weather. We evaluate the performance on disjoint testing sets D^t = {(x^t_k, y^t_k)}_{k=1}^{L} and report the results over 10 independent runs. |
| Researcher Affiliation | Industry | Jinsung Yoon, Sercan Ö. Arik, Tomas Pfister, Google Cloud AI |
| Pseudocode | Yes | Pseudo-code of the LIMIS training is in Algorithm 1. Pseudo-code of the LIMIS inference is in Algorithm 2. |
| Open Source Code | No | The paper does not provide concrete access to source code for the LIMIS methodology described. It only provides links to benchmark models (LIME, SILO, MAPLE) in Appendix C: Implementations of benchmark models. There is no explicit statement or link for LIMIS code. |
| Open Datasets | Yes | We next study LIMIS on 3 real-world regression datasets: (1) Blog Feedback, (2) Facebook Comment, (3) News Popularity; and 2 real-world classification datasets: (4) Adult Income, (5) Weather. These are well-known, publicly available benchmark datasets. |
| Dataset Splits | No | We evaluate the performance on disjoint testing sets D^t = {(x^t_k, y^t_k)}_{k=1}^{L} and report the results over 10 independent runs. If there is no explicit probe dataset, it can be randomly split from the training dataset (D). The paper mentions using training, probe, and test sets, and notes that a probe dataset can be randomly split from training data. However, it does not specify concrete percentages, absolute counts, or reference predefined splits for the datasets used. |
| Hardware Specification | Yes | On a single NVIDIA V100 GPU (without any hardware optimizations), LIMIS yields a training time of less than 5 hours (including Stage 1, 2 and 3) and an interpretable inference time of less than 10 seconds per testing instance. Training time is computed on a single K80 GPU until the model convergence (i.e., no more validation fidelity improvements). |
| Software Dependencies | No | The paper lists various predictive models and their hyperparameters (e.g., XGBoost, LightGBM, MLP, Ridge Regression) in Appendix A, along with optimizers and activation functions. However, it does not specify version numbers for any software libraries, frameworks, or languages used (e.g., 'PyTorch 1.9', 'Python 3.8'). |
| Experiment Setup | Yes | Hyper-parameters are optimized to maximize the validation fidelity. Appendix A: Hyper-parameters of the predictive models, details hyperparameters for XGBoost (booster gbtree, max depth 6, learning rate 0.3, number of estimators 1000, reg alpha 0), LightGBM (booster gbdt, learning rate 0.1, number of estimators 1000, min data in leaf 20), Random Forests (number of estimators 1000, criterion gini), Multi-layer Perceptron (number of layers 4, hidden units [feature dimensions, feature dimensions/2, feature dimensions/4, feature dimensions/8], activation function ReLU, early stopping True with patience 10, batch size 256, maximum number of epochs 200, optimizer Adam), and others. |
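The MLP hyperparameters quoted above (4 layers with hidden units halving from the feature dimension, ReLU, Adam, batch size 256, 200 epochs maximum, early stopping with patience 10) can be sketched as follows. This is a minimal illustration, assuming scikit-learn's `MLPRegressor`; the paper does not name the framework it used, and the helper `build_mlp` is hypothetical.

```python
from sklearn.neural_network import MLPRegressor

def build_mlp(feature_dim: int) -> MLPRegressor:
    # Hidden units halve at each of the 4 layers:
    # [d, d/2, d/4, d/8], as listed in Appendix A of the paper.
    sizes = tuple(max(1, feature_dim // (2 ** i)) for i in range(4))
    return MLPRegressor(
        hidden_layer_sizes=sizes,
        activation="relu",
        solver="adam",           # Adam optimizer
        batch_size=256,
        max_iter=200,            # maximum number of epochs
        early_stopping=True,
        n_iter_no_change=10,     # early-stopping patience
    )

# e.g. a 64-dimensional feature space yields hidden layers (64, 32, 16, 8)
model = build_mlp(64)
```

Validation-fidelity-based hyperparameter search, as the paper describes, would wrap such a constructor in a search loop rather than fix these values.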