MetaOOD: Automatic Selection of OOD Detection Models

Authors: Yuehan Qin, Yichi Zhang, Yi Nian, Xueying Ding, Yue Zhao

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Through extensive experimentation with 24 unique test dataset pairs to choose from among 11 OOD detection models, we demonstrate that MetaOOD significantly outperforms existing methods and only brings marginal time overhead. Our results, validated by Wilcoxon statistical tests, show that MetaOOD surpasses a diverse group of 11 baselines, including established OOD detectors and advanced unsupervised selection methods.
Researcher Affiliation | Academia | 1 University of Southern California; 2 University of Chicago; 3 Carnegie Mellon University
Pseudocode | Yes | A.1 PSEUDO-CODE FOR META-TRAIN AND ONLINE MODEL SELECTION. We discussed meta-training and online model selection in 3.3 and 3.4, respectively. Here is the pseudo-code for the two phases. Algorithm 1: Offline OOD detection meta-learner training. Algorithm 2: Online OOD detection model selection.
Open Source Code | Yes | Accessibility and Reproducibility. We release the testbed, corresponding code, and the proposed meta-learner at https://github.com/yqin43/metaood.
Open Datasets | Yes | ID Datasets: CIFAR10 (Krizhevsky, 2009), CIFAR100 (Krizhevsky, 2009), ImageNet (Deng et al., 2009), FashionMNIST (Xiao et al., 2017). Classic OOD Group: CIFAR10, CIFAR100, MNIST (Deng, 2012), Places365 (Zhou et al., 2018), SVHN (Netzer et al., 2011), Textures (Cimpoi et al., 2014), TIN (Le & Yang, 2015). Large-Scale OOD Group: SSB hard (Vaze et al., 2022), NINCO (Bitterwolf et al., 2023), iNaturalist (Horn et al., 2017), Textures (Cimpoi et al., 2014), OpenImage-O (Wang et al., 2022).
Dataset Splits | Yes | We utilize the train-test split of datasets preprocessed as described in (Yang et al., 2022). To summarize, we create our ID-OOD dataset pairs using the following datasets: 1. ID Datasets: CIFAR10 (Krizhevsky, 2009), CIFAR100 (Krizhevsky, 2009), ImageNet (Deng et al., 2009), FashionMNIST (Xiao et al., 2017) ... We construct the ID-OOD dataset pair, and set the training and testing set as follows: (i) Training: CIFAR10 from ID and OOD from the classic OOD group shown above; and (ii) Testing: CIFAR100, ImageNet, and FashionMNIST from ID, and OOD from large-scale OOD dataset group.
Hardware Specification | Yes | Hardware. For consistency, all models are built using the pytorch-ood library (Kirchheim et al., 2022) on NVIDIA RTX 6000 Ada, 48 GB RAM workstations.
Software Dependencies | No | The paper mentions 'pytorch-ood library' and 'XGBoost model' and 'BERT-based all-mpnet-base-v2 model by Hugging Face' but does not provide specific version numbers for these software components or libraries, which is required for reproducibility.
Experiment Setup | Yes | Section B.1 PROMPTS TO LLM FOR ZERO-SHOT SELECTION OF THE OPTIMAL OOD DETECTOR: ...To ensure consistency, we set temperature parameter to 0, and top p parameter to 0.999.
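
The Research Type entry says the comparison against baselines was validated by Wilcoxon statistical tests. As a rough illustration of what a paired test of that kind computes, the sketch below implements a two-sided Wilcoxon signed-rank test with a normal approximation in plain Python. The function name and the sample numbers are hypothetical placeholders, not the paper's code or results, and the paper may use a different variant (for example an exact or one-sided test).

```python
import math


def wilcoxon_signed_rank(x, y):
    """Two-sided Wilcoxon signed-rank test (normal approximation).

    Returns (W, p) with W = min(W+, W-). Zero differences are dropped and
    tied absolute differences share their average rank. No continuity or
    tie correction is applied, so treat p as approximate for small samples.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    # Rank the absolute differences, averaging ranks within tied groups.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    w = min(w_plus, w_minus)

    # Under the null hypothesis W is approximately normal with this mean and sd.
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return w, p


# Hypothetical placeholder data: rank of the detector chosen by two selection
# methods on 8 dataset pairs (lower is better). Not taken from the paper.
method_a = [1, 2, 1, 3, 2, 1, 2, 1]
method_b = [4, 5, 3, 6, 4, 5, 7, 2]
w_stat, p_value = wilcoxon_signed_rank(method_a, method_b)
print(f"W = {w_stat}, p = {p_value:.4f}")
```

In practice a library routine such as `scipy.stats.wilcoxon` would normally be preferred; the hand-written version is only meant to make the ranking and summation steps visible.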
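
The Software Dependencies entry notes that no version numbers are given for the libraries involved. One way a reader reproducing the work could record the versions actually installed in their own environment is sketched below. The distribution names in `PACKAGES` are assumptions about how the mentioned components are published on PyPI; the paper does not state them, and the helper names are hypothetical.

```python
from importlib import metadata

# Assumed PyPI distribution names for the components the paper mentions
# (pytorch-ood, XGBoost, and a sentence-embedding model); not from the paper.
PACKAGES = ["torch", "pytorch-ood", "xgboost", "sentence-transformers"]


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def as_requirements(versions):
    """Format the installed packages as pinned requirement lines."""
    return [f"{name}=={ver}" for name, ver in versions.items() if ver is not None]


if __name__ == "__main__":
    # Print pins only for packages that are present in this environment.
    for line in as_requirements(installed_versions(PACKAGES)):
        print(line)
```

Saving the printed lines alongside experiment outputs gives later readers a concrete environment to match, which is the gap the entry above points out.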