ShortcutProbe: Probing Prediction Shortcuts for Learning Robust Models
Authors: Guangtao Zheng, Wenqian Ye, Aidong Zhang
IJCAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We theoretically analyze the effectiveness of the framework and empirically demonstrate that it is an efficient and practical tool for improving a model's robustness to spurious bias on diverse datasets. Through extensive experiments, we show that our method successfully trains models robust to spurious biases without prior knowledge about these biases. Section 5: Experiments, Section 5.1: Datasets, Section 5.2: Experimental Setup, Section 5.3: Analysis of Probe Set, Section 5.4: Main Results (Tables 1, 2, 3), Section 5.5: Ablation Studies (Figure 3). |
| Researcher Affiliation | Academia | Guangtao Zheng , Wenqian Ye and Aidong Zhang University of Virginia EMAIL |
| Pseudocode | No | The paper describes the methodology using prose and mathematical equations. It mentions "Details of the training algorithm are provided in Appendix." but the appendix content is not provided in the analyzed text. Therefore, no structured pseudocode or algorithm blocks are present in the provided paper text. |
| Open Source Code | Yes | Code is available at https://github.com/gtzheng/ShortcutProbe. |
| Open Datasets | Yes | Waterbirds [Sagawa et al., 2019], CelebA [Liu et al., 2015], CheXpert [Irvin et al., 2019], ImageNet-9 [Ilyas et al., 2019] is a subset of ImageNet [Deng et al., 2009], ImageNet-A [Hendrycks et al., 2021], NICO [He et al., 2021], MultiNLI [Williams et al., 2017], CivilComments [Borkan et al., 2019]. |
| Dataset Splits | Yes | From the chosen data source, such as the training or validation set, we sorted the samples within each class by their prediction losses and divided them into two equal halves: a high-loss set and a low-loss set. ... Then, we retrained the model on half of the validation set using various bias mitigation methods. For our method, we first constructed the probe set using the same half of the validation set and used the probe set for shortcut detection and mitigation. The remaining half of the validation set was used for model selection and hyperparameter tuning. ... We prepared the training and validation data as in [Kim et al., 2022] and [Bahng et al., 2020]. |
| Hardware Specification | No | The paper does not provide any specific hardware details such as GPU or CPU models, memory specifications, or detailed computing environments used for the experiments. |
| Software Dependencies | No | The paper mentions using "ResNet-50 as the backbone network", "ResNet-18", and a "pretrained BERT model [Kenton and Toutanova, 2019]" but does not provide specific version numbers for these or any other core software libraries/frameworks. |
| Experiment Setup | Yes | We first trained a base model initialized with pretrained weights using empirical risk minimization (ERM) on the training dataset. Then, we retrained the model on half of the validation set... The remaining half of the validation set was used for model selection and hyperparameter tuning. ... ψ* = arg min_ψ L_det + ηL_reg, where η > 0 represents the regularization strength. ... θ₂* = arg min_{θ₂} L_spu + λL_ori, where λ > 0 is the regularization strength. ... We retrain only the final classification layer of the model while keeping the feature extractor frozen. |
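The probe-set construction quoted in the Dataset Splits row (sort samples within each class by prediction loss, then divide into a high-loss half and a low-loss half) can be sketched as plain Python. This is an illustrative reconstruction, not the authors' released code; the function name and the `(sample_id, class_label, loss)` tuple layout are assumptions.

```python
# Hypothetical sketch of the per-class high-loss / low-loss split used to
# build the probe set. Each sample is a (sample_id, class_label, loss) tuple.
from collections import defaultdict

def build_probe_set(samples):
    """Split each class into a high-loss half and a low-loss half."""
    by_class = defaultdict(list)
    for sid, label, loss in samples:
        by_class[label].append((sid, loss))
    high_loss, low_loss = {}, {}
    for label, items in by_class.items():
        # Sort by prediction loss, highest first, then cut in the middle.
        items.sort(key=lambda x: x[1], reverse=True)
        mid = len(items) // 2
        high_loss[label] = [sid for sid, _ in items[:mid]]
        low_loss[label] = [sid for sid, _ in items[mid:]]
    return high_loss, low_loss
```

In the paper the two halves come from a chosen data source such as the training or validation set; here any iterable of scored samples works.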
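The Experiment Setup row states that only the final classification layer is retrained while the feature extractor stays frozen. A minimal NumPy sketch of that last-layer retraining is below, using plain softmax regression with gradient descent on pre-extracted features; the function name, optimizer, and loss are illustrative stand-ins, not the paper's exact objective (which adds the regularized terms quoted above).

```python
import numpy as np

def retrain_last_layer(features, labels, n_classes, lr=0.1, epochs=100):
    """Fit a softmax classifier on frozen features (illustrative sketch)."""
    n, d = features.shape
    W = np.zeros((d, n_classes))
    b = np.zeros(n_classes)
    onehot = np.eye(n_classes)[labels]
    for _ in range(epochs):
        logits = features @ W + b
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        grad = (probs - onehot) / n                  # cross-entropy gradient
        W -= lr * features.T @ grad
        b -= lr * grad.sum(axis=0)
    return W, b
```

In a deep-learning framework the same idea amounts to freezing the backbone's parameters and optimizing only the final linear layer.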