Explanation Shift: How Did the Distribution Shift Impact the Model?

Authors: Carlos Mougan, Klaus Broelemann, Gjergji Kasneci, Thanassis Tiropanis, Steffen Staab

TMLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We provide theoretical and experimental evidence and demonstrate the effectiveness of our approach on synthetic and real data. Additionally, we release an open-source Python package, skshift, which implements our method and provides usage tutorials for further reproducibility.
Researcher Affiliation | Collaboration | Carlos Mougan (AI Office, European Commission & University of Southampton); Klaus Broelemann (Schufa Holding AG, Germany); Gjergji Kasneci (Schufa Holding AG & Technical University of Munich); Thanassis Tiropanis (University of Southampton); Steffen Staab (University of Stuttgart & University of Southampton)
Pseudocode | No | The paper describes methods and uses mathematical formulations but does not contain a clearly labeled pseudocode or algorithm block.
Open Source Code | Yes | Additionally, we release an open-source Python package, skshift, which implements our method and provides usage tutorials for further reproducibility. To ensure reproducibility, we make the data, code repositories, and experiments publicly available at https://github.com/cmougan/ExplanationShift. The skshift package is also available at https://skshift.readthedocs.io/
Open Datasets | Yes | In the main body of the paper we base our comparisons on the UCI Adult Income dataset (Dua & Graff, 2017) and on synthetic data. In the Appendix, we extend experiments to several other datasets, which confirm our findings: ACS Travel Time, ACS Employment, and the Stack Overflow dataset (Stackoverflow, 2019).
Dataset Splits | Yes | The model gψ is trained each time on each state using only the new inputs D_X^new in the absence of the label, and a 50/50 random train-test split evaluates its performance.
Hardware Specification | Yes | Experiments were run on a 4-vCPU server with 32 GB RAM.
Software Dependencies | Yes | We used shap version 0.41.0 and lime version 0.2.0.1 as software packages.
Experiment Setup | Yes | We train fθ on D^{tr,ρ=0} using a gradient-boosted decision tree, while gψ : S(fθ, D_X^{val,ρ}) → {0, 1} is trained on different datasets with different values of ρ. For gψ we use a logistic regression. In this experiment, we changed the hyperparameters of the original model: for the decision tree, we varied the depth of the tree; for the gradient-boosted decision trees, we changed the number of estimators; and for the random forest, both hyperparameters.
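The pipeline above (compute explanations of a monitored model fθ on source and new data, then train a logistic-regression detector gψ on those explanation vectors with a 50/50 train-test split) can be sketched in miniature. This is an illustrative toy, not the paper's setup: it assumes a pre-trained linear model (whose exact SHAP values have the closed form w_i·(x_i − E[x_i]) for independent features), synthetic Gaussian data with a mean shift in one feature, and a hand-rolled gradient-descent logistic regression in place of the paper's gradient-boosted trees and the shap package.

```python
import math
import random

random.seed(0)

# Assumed pre-trained "model to monitor": a linear model f(x) = w . x.
w = [1.0, 1.0]

def shap_linear(x, mu):
    # For a linear model with independent features, the exact SHAP value of
    # feature i is w_i * (x_i - E[x_i]).
    return [w[i] * (x[i] - mu[i]) for i in range(len(x))]

# Source distribution: two independent standard normals.
source = [[random.gauss(0, 1), random.gauss(0, 1)] for _ in range(1000)]
# "New" distribution: the mean of the second feature has shifted by 1.
new = [[random.gauss(0, 1), random.gauss(1, 1)] for _ in range(1000)]

# Feature means estimated on source data only (the reference distribution).
mu = [sum(x[i] for x in source) / len(source) for i in range(2)]

# Explanation space, labeled 0 = source, 1 = new; 50/50 random split.
data = [(shap_linear(x, mu), 0) for x in source] + \
       [(shap_linear(x, mu), 1) for x in new]
random.shuffle(data)
train, test = data[:1000], data[1000:]

# Explanation-shift detector g: logistic regression via batch gradient descent.
g_w, g_b, lr = [0.0, 0.0], 0.0, 0.5
for _ in range(500):
    grad_w, grad_b = [0.0, 0.0], 0.0
    for s, y in train:
        p = 1 / (1 + math.exp(-(g_w[0] * s[0] + g_w[1] * s[1] + g_b)))
        for i in range(2):
            grad_w[i] += (p - y) * s[i]
        grad_b += p - y
    for i in range(2):
        g_w[i] -= lr * grad_w[i] / len(train)
    g_b -= lr * grad_b / len(train)

# Held-out accuracy well above 0.5 signals that the explanations have shifted.
acc = sum(
    ((g_w[0] * s[0] + g_w[1] * s[1] + g_b) > 0) == (y == 1) for s, y in test
) / len(test)
print(f"detector test accuracy: {acc:.2f}")
```

In the paper, the analogous signal is the held-out performance (e.g. AUC) of gψ on real SHAP values of a gradient-boosted model; here the mean shift propagates into the second explanation coordinate, so the linear detector separates the two samples far better than chance.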