Clustering-Based Validation Splits for Model Selection under Domain Shift
Authors: Andrea Napoli, Paul White
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In experiments, the technique consistently outperforms alternative splitting strategies across a range of datasets and training algorithms, for both domain generalisation and unsupervised domain adaptation tasks. Analysis also shows the MMD between the training and validation sets to be well-correlated with test domain accuracy, further substantiating the validity of this approach. ... 5 Experiments |
| Researcher Affiliation | Academia | Andrea Napoli & Paul White EMAIL Institute of Sound and Vibration Research University of Southampton, UK |
| Pseudocode | Yes | Algorithm 1 Constrained kernel k-means clustering |
| Open Source Code | No | The paper does not provide a link to the source code for the described methodology, nor does it contain a clear statement about its release or availability in supplementary materials. The OpenReview link provided is a review forum, not a code repository. |
| Open Datasets | Yes | Camelyon17-WILDS (Bándi et al., 2019; Koh et al., 2021) tumour detection in tissue samples across 5 hospitals... License: CC0. ... SVIRO (Dias Da Cruz et al., 2020) classification of vehicle rear seat occupancy... License: CC BY-NC-SA 4.0. ... Terra Incognita (Beery et al., 2018) classification of wild animals... License: CDLA-Permissive 1.0. |
| Dataset Splits | Yes | S must be partitioned into training and validation sets, T and V respectively. ... T and V should be of sizes determined by a user-defined holdout fraction h satisfying 0 < h < 1... Table 6: Holdout fraction 0.2; UDA holdout fraction 0.5. ... Every domain is tested 3 times for reproducibility, each time with a different random seed for model initialisation, hyperparameter search and other stochastic variables. The reported accuracy values are averages over all domains and repeats. |
| Hardware Specification | No | The authors acknowledge the use of the IRIDIS High Performance Computing Facility, and associated support services at the University of Southampton, in the completion of this work. ... In total, the experiments involve training 5,160 models, requiring around 100 GPU-days of computation. These statements indicate the use of computing resources but lack specific hardware details such as GPU/CPU models. |
| Software Dependencies | Yes | Experiments are conducted using the DomainBed framework (Gulrajani & Lopez-Paz, 2021). This means all-but-one of the domains are placed in the development set... The Gurobi Optimizer (Gurobi Optimization LLC, 2023) is used to solve the LPs. |
| Experiment Setup | Yes | Table 6: General parameter values and training details for the experiments. Hyperparameter random search size: 10; Number of trials: 3; Holdout fraction: 0.2; UDA holdout fraction: 0.5; Number of training steps: 3000; Gaussian kernel bandwidth: 1; Finetuning iterations before split: 3000; Nyström subset size (if applicable): 2000; Architecture: ResNet-18; Class balanced: True |
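The assessment above notes that the MMD between the training and validation sets correlates well with test-domain accuracy, and the setup table lists a Gaussian kernel with bandwidth 1. As a minimal sketch of how such an MMD estimate is typically computed (a standard biased estimator; the function names `gaussian_kernel` and `mmd_squared` are illustrative, not taken from the paper):

```python
import numpy as np

def gaussian_kernel(X, Y, bandwidth=1.0):
    """Gaussian (RBF) kernel matrix between the rows of X and Y."""
    sq = np.sum(X**2, 1)[:, None] + np.sum(Y**2, 1)[None, :] - 2 * X @ Y.T
    return np.exp(-sq / (2 * bandwidth**2))

def mmd_squared(X, Y, bandwidth=1.0):
    """Biased estimate of the squared MMD between samples X and Y:
    mean K(X,X) + mean K(Y,Y) - 2 * mean K(X,Y)."""
    kxx = gaussian_kernel(X, X, bandwidth).mean()
    kyy = gaussian_kernel(Y, Y, bandwidth).mean()
    kxy = gaussian_kernel(X, Y, bandwidth).mean()
    return kxx + kyy - 2 * kxy
```

Under this estimator, a larger MMD between T and V indicates a validation set that is more distributionally distant from the training set, which is the quantity the paper relates to test-domain accuracy.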
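The pseudocode row cites "Algorithm 1: Constrained kernel k-means clustering". The paper's constrained variant enforces cluster-size requirements via LPs solved with Gurobi; the sketch below is only the plain, unconstrained kernel k-means core on a precomputed kernel matrix, included to illustrate the kernel-space distance computation the algorithm builds on (`rbf_kernel` and `kernel_kmeans` are illustrative names, not the paper's implementation):

```python
import numpy as np

def rbf_kernel(X, bandwidth=1.0):
    """Gaussian kernel matrix over the rows of X."""
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq / (2 * bandwidth**2))

def kernel_kmeans(K, n_clusters, n_iter=50, seed=0):
    """Plain (unconstrained) kernel k-means on a precomputed kernel K.

    Distances to cluster means are computed entirely from kernel
    evaluations: ||phi(x) - mu_k||^2
      = K(x,x) - (2/m) sum_{j in C_k} K(x,j) + (1/m^2) sum_{i,j in C_k} K(i,j).
    """
    n = K.shape[0]
    rng = np.random.default_rng(seed)
    # Seed each cluster with one random point, assign the rest by distance.
    seeds = rng.choice(n, size=n_clusters, replace=False)
    labels = np.argmin(
        np.diag(K)[:, None] - 2 * K[:, seeds] + np.diag(K)[seeds][None, :], axis=1
    )
    for _ in range(n_iter):
        dist = np.full((n, n_clusters), np.inf)
        for k in range(n_clusters):
            mask = labels == k
            m = mask.sum()
            if m == 0:
                continue  # empty cluster: leave its distances at infinity
            dist[:, k] = (
                np.diag(K)
                - 2 * K[:, mask].sum(axis=1) / m
                + K[np.ix_(mask, mask)].sum() / m**2
            )
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels
```

The constrained version in the paper replaces the greedy `argmin` assignment step with a size-constrained assignment problem, which is where the LP solver enters.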
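The splits row states that S is partitioned into T and V with sizes set by a user-defined holdout fraction h. In the paper this constraint is satisfied exactly during the constrained clustering itself; purely for illustration, the hypothetical helper below shows the general idea of turning cluster labels into a cluster-coherent train/validation split that only approximates the target fraction:

```python
import numpy as np

def split_by_clusters(labels, holdout_fraction=0.2, seed=0):
    """Illustrative greedy split: move whole clusters into the validation
    set until roughly a `holdout_fraction` share of samples is held out.

    Note: this is NOT the paper's method, which constrains cluster sizes
    during clustering so the split matches h exactly.
    """
    n = len(labels)
    target = holdout_fraction * n
    rng = np.random.default_rng(seed)
    clusters = rng.permutation(np.unique(labels))
    val_clusters, val_size = [], 0
    for c in clusters:
        size = int(np.sum(labels == c))
        # Take the cluster if it fits, or if V would otherwise be empty.
        if val_size + size <= target or not val_clusters:
            val_clusters.append(c)
            val_size += size
        if val_size >= target:
            break
    val_mask = np.isin(labels, val_clusters)
    return np.flatnonzero(~val_mask), np.flatnonzero(val_mask)
```

Keeping clusters intact is the point of the approach: the validation set is then drawn from regions of feature space that are coherent and distinct from the training data, rather than an i.i.d. shuffle of it.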