Towards Formalizing Spuriousness of Biased Datasets Using Partial Information Decomposition
Authors: Barproda Halder, Faisal Hamman, Pasan Dissanayake, Qiuyi Zhang, Ilia Sucholutsky, Sanghamitra Dutta
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, we also perform empirical evaluation to demonstrate the trends of unique, redundant, and synergistic information, as well as our proposed spuriousness measure across 6 benchmark datasets under various experimental settings. We observe an agreement between our preemptive measure of dataset spuriousness and post-training model generalization metrics such as worst-group accuracy, further supporting our proposition. |
| Researcher Affiliation | Collaboration | Barproda Halder, Department of Electrical and Computer Engineering, University of Maryland, College Park [...] Qiuyi Zhang, Google Research [...] Ilia Sucholutsky, Department of Computer Science, Princeton University |
| Pseudocode | Yes | Algorithm 1: Spuriousness Disentangler: An Autoencoder-Based Explainability Framework |
| Open Source Code | Yes | The code is available at https://github.com/Barproda/spuriousness-disentangler. |
| Open Datasets | Yes | Our evaluation spans six datasets: Waterbird (Wah et al., 2011), Adult (Becker & Kohavi, 1996), CelebA (Lee et al., 2020), Dominoes (Shah et al., 2020), Spawrious (Lynch et al., 2023), and Colored MNIST (Arjovsky et al., 2019). |
| Dataset Splits | Yes | Table 5: Summary of the datasets — Waterbird per-group counts: Train 3,498 / 184 / 56 / 1,057; Validation 467 / 466 / 133 / 133; Test 2,255 / 2,255 / 642 / 642 |
| Hardware Specification | Yes | All experiments are executed on NVIDIA RTX A4500. |
| Software Dependencies | No | The paper mentions 'DIT package (James et al., 2018)' but does not specify a version number for this or any other software component. |
| Experiment Setup | Yes | The hyperparameters are as follows: a batch size of 64, a learning rate of 0.001, a Cosine Annealing LR scheduler, an Adam optimizer with a weight decay of 0.0001, 50 pretraining epochs, followed by 100 epochs of additional training. When fine-tuning ResNet-50 we use the following hyperparameters: batch size of 64, learning rate of 0.0001, Cosine Annealing LR scheduler, stochastic gradient descent (SGD) optimizer with a weight decay of 0.0001, binary cross-entropy as the loss function, and 100 epochs. |
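The reported hyperparameters can be sketched as a PyTorch configuration. This is a hedged illustration, not the authors' code: the models below are placeholders, and the assumption that the cosine schedule spans the main 100-epoch phase (rather than pretraining plus training) is ours.

```python
import torch
from torch import nn, optim
from torch.optim.lr_scheduler import CosineAnnealingLR

# Autoencoder phase, per the reported settings: Adam, lr 1e-3,
# weight decay 1e-4, cosine-annealed LR, batch size 64.
autoencoder = nn.Sequential(  # placeholder architecture
    nn.Linear(512, 64), nn.ReLU(), nn.Linear(64, 512)
)
ae_opt = optim.Adam(autoencoder.parameters(), lr=1e-3, weight_decay=1e-4)
ae_sched = CosineAnnealingLR(ae_opt, T_max=100)  # assumed: main training phase

# ResNet-50 fine-tuning phase, per the reported settings: SGD, lr 1e-4,
# weight decay 1e-4, BCE loss, 100 epochs. A linear head stands in for
# the actual ResNet-50 here.
head = nn.Linear(2048, 1)
ft_opt = optim.SGD(head.parameters(), lr=1e-4, weight_decay=1e-4)
ft_sched = CosineAnnealingLR(ft_opt, T_max=100)
criterion = nn.BCEWithLogitsLoss()  # binary cross-entropy on logits
```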
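Since the paper computes unique, redundant, and synergistic information with the DIT package (James et al., 2018), a minimal usage sketch may help readers check their setup. This is an illustration only: the XOR toy distribution and the Williams–Beer measure (`PID_WB`) are our choices, and the paper's exact PID measure may differ.

```python
import dit
from dit.pid import PID_WB

# Two binary "features" (vars 0, 1) and an XOR "label" (var 2),
# encoded as outcome strings "x1 x2 y" with uniform probability.
xor = dit.Distribution(['000', '011', '101', '110'], [0.25] * 4)

# By default the last variable is the target and the rest are sources.
pid = PID_WB(xor)

redundancy = pid[((0,), (1,))]  # info both features share about the label
unique_0 = pid[((0,),)]         # info carried by feature 0 alone
synergy = pid[((0, 1),)]        # info available only from both jointly

# XOR is the canonical purely synergistic example: neither feature alone
# predicts the label, but together they determine it (1 bit of synergy).
print(redundancy, unique_0, synergy)
```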