Regularizing Black-box Models for Improved Interpretability

Authors: Gregory Plumb, Maruan Al-Shedivat, Ángel Alexander Cabrera, Adam Perer, Eric Xing, Ameet Talwalkar

NeurIPS 2020

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "We demonstrate that post-hoc explanations for EXPO-regularized models have better explanation quality, as measured by the common fidelity and stability metrics. We verify that improving these metrics leads to significantly more useful explanations with a user study on a realistic task." (Section 4, Experimental Results) |
| Researcher Affiliation | Collaboration | Gregory Plumb (Carnegie Mellon University), Maruan Al-Shedivat (Carnegie Mellon University), Ángel Alexander Cabrera (Carnegie Mellon University), Adam Perer (Carnegie Mellon University), Eric Xing (CMU, Petuum Inc.), Ameet Talwalkar (CMU, Determined AI) |
| Pseudocode | Yes | Algorithm 1: Neighborhood-fidelity regularizer |
| Open Source Code | Yes | https://github.com/GDPlumb/ExpO |
| Open Datasets | Yes | "seven regression problems from the UCI collection [Dheeru and Karra Taniskidou, 2017], the MSD dataset, and Support2, which is an in-hospital mortality classification problem. Dataset statistics are in Table 2." As in [Bloniarz et al., 2016], the MSD dataset is treated as a regression problem with the goal of predicting the release year of a song. Support2: http://biostat.mc.vanderbilt.edu/wiki/Main/SupportDesc |
| Dataset Splits | No | The paper mentions evaluating on test data and discusses parameters for the neighborhoods (N_x and N_x^reg), but it does not provide explicit train/validation/test splits (percentages or counts) in the main text. |
| Hardware Specification | Yes | "Each model takes less than a few minutes to train on an Intel 8700k CPU, so computational cost was not a limiting factor in our experiments." |
| Software Dependencies | No | The paper mentions using SGD with Adam [Kingma and Ba, 2014] as the optimization algorithm, but it does not specify any software libraries, frameworks, or version numbers (e.g., Python, TensorFlow, PyTorch, scikit-learn). |
| Experiment Setup | Yes | "The network architectures and hyper-parameters are chosen using a grid search; for more details see Appendix A.3. For the final results, we set N_x to be N(x, σ) with σ = 0.1 and N_x^reg to be N(x, σ) with σ = 0.5." |
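To make the table's "Neighborhood-fidelity regularizer" entry concrete, here is a minimal NumPy sketch of the underlying idea: sample points in a neighborhood N(x, σ) around a query point, fit the best local linear surrogate of the black-box model by least squares, and measure the surrogate's residual error. The function name, sample count, and least-squares formulation are illustrative assumptions, not the authors' implementation (which regularizes a neural network during training; see Algorithm 1 in the paper and the linked repository).

```python
import numpy as np

def neighborhood_fidelity(f, x, sigma=0.1, n_samples=50, rng=None):
    """Hypothetical sketch of a neighborhood-fidelity penalty.

    Samples n_samples points from N(x, sigma), fits a local linear
    surrogate g(x') = w @ x' + b to the black-box f by least squares,
    and returns the surrogate's mean squared residual. A small value
    means f is locally well-approximated by a linear explanation.
    """
    rng = np.random.default_rng(rng)
    # Sample the neighborhood N(x, sigma) around the query point.
    Xn = x + sigma * rng.standard_normal((n_samples, x.shape[0]))
    yn = f(Xn)
    # Solve for the best local linear model (weights plus intercept).
    A = np.hstack([Xn, np.ones((n_samples, 1))])
    coef, *_ = np.linalg.lstsq(A, yn, rcond=None)
    # Mean squared residual of the linear fit = neighborhood infidelity.
    residual = yn - A @ coef
    return float(np.mean(residual ** 2))
```

For a model that is already linear the penalty is essentially zero, while a nonlinear model incurs a positive penalty; training with this term added to the loss would push the model toward being locally explainable, which is the paper's central claim.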