Decomposing Global Feature Effects Based on Feature Interactions
Authors: Julia Herbinger, Marvin N. Wright, Thomas Nagler, Bernd Bischl, Giuseppe Casalicchio
JMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically evaluate the theoretical characteristics of the proposed methods based on various feature effect methods in different experimental settings. Moreover, we apply our introduced methodology to three real-world examples to showcase their usefulness. |
| Researcher Affiliation | Academia | 1 Department of Statistics, LMU Munich, Munich, Germany 2 Munich Center for Machine Learning (MCML), Munich, Germany 3 Leibniz Institute for Prevention Research and Epidemiology, Bremen, Germany 4 Faculty of Mathematics and Computer Science, University of Bremen, Bremen, Germany 5 Department of Public Health, University of Copenhagen, Copenhagen, Denmark |
| Pseudocode | Yes | Algorithm 1: Partitioning algorithm of GADGET; Algorithm 2: PINT |
| Open Source Code | Yes | All proposed methods and reproducible scripts for the experiments are available online via https://github.com/JuliaHerbinger/gadget/. |
| Open Datasets | Yes | bikesharing data set (James et al., 2022); COMPAS data set... collected by Pro Publica (Larson et al., 2016); spam data set (Hopkins et al., 1999) |
| Dataset Splits | Yes | The R² measured on a separately drawn test set of size 10000 following the same distribution is 0.94. (Section 4.3) ...measured by a 5-fold cross-validation, indicating good performance (Hopkins et al., 1999). (Section 9) |
| Hardware Specification | No | The paper describes various models (feed-forward neural network, GAM, XGBoost, random forest, SVM) and their configurations, but it does not specify the hardware (e.g., CPU, GPU models) used for training or evaluation. |
| Software Dependencies | No | The paper mentions 'R package version 1.3-2' for ISLR2 within a dataset citation, but this refers to a tool associated with a dataset, not the specific software dependencies with version numbers used for implementing the authors' methodology (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | We then draw 500 observations from these random variables and fit a feed-forward neural network (NN) with a single hidden layer of size 10 and weight decay of 0.001. (Section 4.3) As stopping criteria, we choose a maximum tree depth of 6, a minimum number of observations per leaf of 40, and set the improvement parameter γ to 0.2. (Section 6.1) |
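The experiment setup quoted above (Section 4.3) can be sketched in a few lines. This is a hedged illustration only: the paper's own implementation is in R (see the linked repository), and the data-generating process below is a placeholder I invented, not the one used in the paper. Only the quoted hyperparameters are taken from the source: 500 training observations, a feed-forward network with a single hidden layer of size 10, weight decay 0.001, and a separately drawn test set of size 10000.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)

def draw(n):
    # Placeholder data-generating process (NOT the paper's): three uniform
    # features and a target containing a feature interaction.
    X = rng.uniform(-1, 1, size=(n, 3))
    y = X[:, 0] + X[:, 1] * X[:, 2]
    return X, y

X_train, y_train = draw(500)       # 500 observations, as in Section 4.3
X_test, y_test = draw(10_000)      # separately drawn test set of size 10000

# One hidden layer of size 10; scikit-learn's `alpha` is an L2 penalty,
# standing in for the weight decay of 0.001 mentioned in the paper.
nn = MLPRegressor(hidden_layer_sizes=(10,), alpha=0.001,
                  max_iter=5000, random_state=0).fit(X_train, y_train)

r2 = r2_score(y_test, nn.predict(X_test))
print(round(r2, 2))
```

Note that the reported test R² of 0.94 applies to the paper's setting, not to this synthetic stand-in.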