Out-of-Distribution Detection with Prototypical Outlier Proxy

Authors: Mingrong Gong, Chaoqi Chen, Qingqiang Sun, Yue Wang, Hui Huang

AAAI 2025

Reproducibility assessment (variable, result, and supporting response):
Research Type: Experimental. Evidence: "Extensive experiments across various benchmarks demonstrate the effectiveness of POP. Notably, POP achieves average FPR95 reductions of 7.70%, 6.30%, and 5.42% over the second-best methods on CIFAR-10, CIFAR-100, and ImageNet-200, respectively. Moreover, compared to the recent method NPOS, which relies on outlier synthesis, POP trains 7.2 times faster and performs inference 19.5 times faster."
Researcher Affiliation: Academia. Evidence: "¹College of Computer Science and Software Engineering, Shenzhen University; ²School of Engineering, Great Bay University; ³Department of Computer Science, University College London"
Pseudocode: Yes. Evidence: "Algorithm 1: The algorithm of POP"
Open Source Code: No. The paper does not contain any explicit statement about releasing source code or a link to a code repository.
Open Datasets: Yes. Evidence: "Datasets. For comprehensive experiments, we adopt the OpenOOD benchmark (Yang et al. 2022a; Zhang et al. 2023c), which provides an accurate, standardized, and unified evaluation for fair testing. We include small-scale datasets CIFAR-10 (Krizhevsky, Hinton et al. 2009) and CIFAR-100 (Krizhevsky, Hinton et al. 2009), and the large-scale ImageNet-200, which is a subset of ImageNet-1k (Deng et al. 2009) with the first 200 classes, as our ID datasets. Among them, (i) CIFAR-10 is a small dataset with 10 classes, including 50k training images and 10k test images. We establish the OOD test dataset with CIFAR-100, TinyImageNet (TIN) (Torralba, Fergus, and Freeman 2008), MNIST (Xiao, Rasul, and Vollgraf 2017) (including Fashion-MNIST (Deng 2012)), Texture (Cimpoi et al. 2014), and Places365 (Zhou et al. 2016). (ii) CIFAR-100, another small dataset, consists of 50k training images and 10k test images, with 100 classes. The OOD test dataset includes CIFAR-10, with the remaining datasets configured identically to those in (i). (iii) For the large-scale dataset ImageNet-200, the OOD test dataset consists of SSB (Vaze et al. 2022), NINCO (Bitterwolf, Müller, and Hein 2023), iNaturalist (Van Horn et al. 2018), Places365, and OpenImage-O (Wang et al. 2022)."
Dataset Splits: Yes. Evidence: "(i) CIFAR-10 is a small dataset with 10 classes, including 50k training images and 10k test images. (ii) CIFAR-100, another small dataset, consists of 50k training images and 10k test images, with 100 classes."
Hardware Specification: Yes. Evidence: "Training details. We train a ResNet-18 model (He et al. 2016) from scratch for 100 epochs on CIFAR-10 and CIFAR-100, and 90 epochs on ImageNet-200, using a single Nvidia 4090."
Software Dependencies: No. The paper mentions models like ResNet-18 and optimizers like SGD, but does not specify any software versions for libraries (e.g., PyTorch, TensorFlow) or programming languages.
Experiment Setup: Yes. Evidence: "Training details. We train a ResNet-18 model (He et al. 2016) from scratch for 100 epochs on CIFAR-10 and CIFAR-100, and 90 epochs on ImageNet-200, using a single Nvidia 4090. Training is performed with the SGD optimizer, a learning rate of 0.1, momentum of 0.9, and weight decay of 0.0005."
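The reported training setup can be collected into a framework-agnostic configuration. A minimal sketch, assuming a dictionary layout of our own choosing (the authors released no code, so key names and structure are ours, not the paper's):

```python
# Hedged summary of the training configuration reported in the paper:
# ResNet-18 trained from scratch with SGD. The dict layout is illustrative.
TRAIN_CONFIG = {
    "model": "ResNet-18",
    "epochs": {"CIFAR-10": 100, "CIFAR-100": 100, "ImageNet-200": 90},
    "optimizer": "SGD",
    "lr": 0.1,            # initial learning rate
    "momentum": 0.9,
    "weight_decay": 5e-4,  # 0.0005 as stated in the paper
}
```

In a PyTorch reproduction attempt, these values would map directly onto `torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=5e-4)`; any learning-rate schedule would have to be guessed, since the paper excerpt does not state one.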
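FPR95, the headline metric in the results above, is the false positive rate on OOD samples at the threshold where 95% of ID samples are correctly accepted. A minimal sketch of the standard computation (function name and the higher-score-means-more-ID convention are our assumptions, not the paper's):

```python
import numpy as np

def fpr_at_95_tpr(id_scores, ood_scores):
    """FPR on OOD samples at the threshold giving 95% TPR on ID samples.

    Assumes higher score = more in-distribution.
    """
    # Threshold below which only 5% of ID scores fall, i.e. 95% of ID
    # samples score at or above it (95% true positive rate).
    thresh = np.percentile(id_scores, 5)
    # Fraction of OOD samples mistakenly accepted at that threshold.
    return float(np.mean(np.asarray(ood_scores) >= thresh))
```

For example, perfectly separated scores give an FPR95 of 0.0, while identical ID and OOD score distributions give roughly 0.95; a "7.70% FPR95 reduction" means this quantity drops by 7.70 percentage points relative to the runner-up method.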