Provably Accurate Shapley Value Estimation via Leverage Score Sampling

Authors: Christopher Musco, R. Teal Witter

ICLR 2025

Reproducibility variables, results, and supporting LLM responses:
Research Type: Experimental. "Beyond theoretical guarantees, we find that Leverage SHAP achieves an approximately 50% reduction in error compared to the highly optimized implementation of Kernel SHAP in the widely used SHAP library (Lundberg & Lee, 2017). More extensive experiments are included in Section 5. We find that the improvement of Leverage SHAP over Kernel SHAP is especially substantial in settings where n is large in comparison to m and when we only have access to noisy estimates of the Shapley values."
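Kernel SHAP estimates Shapley values by solving a weighted least-squares regression over sampled subsets, and Leverage SHAP's core idea is to sample the rows of that regression according to their leverage scores. The sketch below illustrates generic leverage score sampling for least squares; the function names and setup are ours, not the paper's implementation.

```python
import numpy as np

def leverage_scores(A):
    # Leverage score of row i: squared norm of row i of Q, where A = QR
    # (equivalently, the i-th diagonal entry of A (A^T A)^{-1} A^T).
    Q, _ = np.linalg.qr(A)
    return np.sum(Q ** 2, axis=1)

def sampled_least_squares(A, b, m, rng):
    # Draw m rows with probability proportional to leverage scores,
    # reweight so the subsampled objective is unbiased, then solve it.
    p = leverage_scores(A)
    p = p / p.sum()
    idx = rng.choice(A.shape[0], size=m, p=p)
    w = 1.0 / np.sqrt(m * p[idx])
    x, *_ = np.linalg.lstsq(A[idx] * w[:, None], b[idx] * w, rcond=None)
    return x
```

Sampling rows by leverage scores preserves the full regression's solution up to small error with high probability, which is the standard motivation for this style of subsampling.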
Researcher Affiliation: Academia. "Christopher Musco, New York University, EMAIL; R. Teal Witter, New York University, EMAIL"
Pseudocode: Yes. "Algorithm 1: Leverage SHAP. Algorithm 2: BernoulliSample(n, c): Efficient Bernoulli Sampling. Algorithm 3: Combo(n, s, i): Compute the ith Combination in Lexicographic Order."
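Algorithm 3's task, computing the i-th size-s subset of {0, ..., n-1} in lexicographic order, is a standard combinatorial unranking problem. A minimal sketch of that standard approach (ours, not necessarily the paper's exact pseudocode):

```python
from math import comb

def combo(n, s, i):
    # Unrank: return the i-th (0-indexed) s-element subset of {0, ..., n-1}
    # in lexicographic order, without enumerating the earlier subsets.
    subset = []
    x = 0
    for k in range(s, 0, -1):
        # Skip leading elements whose block of combinations ends before index i.
        while comb(n - x - 1, k - 1) <= i:
            i -= comb(n - x - 1, k - 1)
            x += 1
        subset.append(x)
        x += 1
    return subset
```

Locating each element by subtracting block sizes avoids materializing any of the exponentially many subsets, which is what makes direct indexing into the subset lattice feasible.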
Open Source Code: Yes. "All of our code is written in Python and can be found on GitHub." (footnote 3: github.com/rtealwitter/leverageshap)
Open Datasets: Yes. "We run our experiments on eight popular datasets from the SHAP library (Lundberg & Lee, 2017). We find that Leverage SHAP outperforms the highly optimized Kernel SHAP, achieving a 50% reduction in error on average (see Table 2). In addition, we find that Leverage SHAP consistently outperforms Permutation SHAP (see the additional experiments in Section G)."
Dataset Splits: No. The paper mentions running experiments on eight popular datasets but does not specify how these datasets were split into training, validation, or test sets.
Hardware Specification: No. The paper does not provide any specific details about the hardware used for running the experiments, such as GPU or CPU models.
Software Dependencies: No. "All of our code is written in Python and can be found on GitHub. We use the SHAP library for the Optimized Kernel SHAP implementation, Tree SHAP, and the datasets. We use XGBoost for training and evaluating trees, under the default parameters." While Python, the SHAP library, and XGBoost are mentioned, specific version numbers for these software components are not provided.
Experiment Setup: Yes. "We use XGBoost for training and evaluating trees, under the default parameters. Figure 3 plots the performance for Kernel SHAP, the Optimized Kernel SHAP, and Leverage SHAP as the number of samples varies (we set m = 5n, 10n, 20n, ..., 160n)."
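The sample budgets m = 5n, 10n, 20n, ..., 160n form a geometric schedule that doubles at each step, ending at 160n = 5n * 2^5. A small helper (the function name is ours) makes the schedule explicit:

```python
def sample_budgets(n):
    # Budgets m = 5n, 10n, 20n, ..., 160n: six values, doubling each step
    # (160 = 5 * 2**5).
    return [5 * n * 2 ** k for k in range(6)]
```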