Out-of-Distribution Detection with Prototypical Outlier Proxy
Authors: Mingrong Gong, Chaoqi Chen, Qingqiang Sun, Yue Wang, Hui Huang
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments across various benchmarks demonstrate the effectiveness of POP. Notably, POP achieves average FPR95 reductions of 7.70%, 6.30%, and 5.42% over the second-best methods on CIFAR-10, CIFAR-100, and ImageNet-200, respectively. Moreover, compared to the recent method NPOS, which relies on outlier synthesis, POP trains 7.2 times faster and performs inference 19.5 times faster. |
| Researcher Affiliation | Academia | 1 College of Computer Science and Software Engineering, Shenzhen University; 2 School of Engineering, Great Bay University; 3 Department of Computer Science, University College London |
| Pseudocode | Yes | Algorithm 1: The algorithm of POP |
| Open Source Code | No | The paper does not contain any explicit statement about releasing source code or a link to a code repository. |
| Open Datasets | Yes | Datasets. For comprehensive experiments, we adopt the OpenOOD benchmark (Yang et al. 2022a; Zhang et al. 2023c), which provides an accurate, standardized, and unified evaluation for fair testing. We include the small-scale datasets CIFAR-10 (Krizhevsky, Hinton et al. 2009) and CIFAR-100 (Krizhevsky, Hinton et al. 2009), and the large-scale ImageNet-200, a subset of ImageNet-1k (Deng et al. 2009) with the first 200 classes, as our ID datasets. Among them, (i) CIFAR-10 is a small dataset with 10 classes, including 50k training images and 10k test images. We establish the OOD test dataset with CIFAR-100, TinyImageNet (TIN) (Torralba, Fergus, and Freeman 2008), MNIST (Deng 2012) (including Fashion-MNIST (Xiao, Rasul, and Vollgraf 2017)), Texture (Cimpoi et al. 2014), and Places365 (Zhou et al. 2016). (ii) CIFAR-100, another small dataset, consists of 50k training images and 10k test images, with 100 classes. The OOD test dataset includes CIFAR-10, with the remaining datasets configured identically to those in (i). (iii) For the large-scale dataset ImageNet-200, the OOD test dataset consists of SSB (Vaze et al. 2022), NINCO (Bitterwolf, Müller, and Hein 2023), iNaturalist (Van Horn et al. 2018), Places365, and OpenImage-O (Wang et al. 2022). |
| Dataset Splits | Yes | (i) CIFAR-10 is a small dataset with 10 classes, including 50k training images and 10k test images. (ii) CIFAR-100, another small dataset, consists of 50k training images and 10k test images, with 100 classes. |
| Hardware Specification | Yes | Training details. We train a ResNet-18 model (He et al. 2016) from scratch for 100 epochs on CIFAR-10 and CIFAR-100, and 90 epochs on ImageNet-200, using a single Nvidia 4090. |
| Software Dependencies | No | The paper mentions models like ResNet-18 and optimizers like SGD, but does not specify any software versions for libraries (e.g., PyTorch, TensorFlow) or programming languages. |
| Experiment Setup | Yes | Training details. We train a ResNet-18 model (He et al. 2016) from scratch for 100 epochs on CIFAR-10 and CIFAR-100, and 90 epochs on ImageNet-200, using a single Nvidia 4090. Training is performed with the SGD optimizer, a learning rate of 0.1, momentum of 0.9, and weight decay of 0.0005. |
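The headline results in the table are reported as FPR95: the false positive rate on OOD data at the score threshold that retains 95% of ID data. A minimal sketch of that metric, assuming higher scores indicate ID (the function name is illustrative, not from the paper):

```python
def fpr_at_95_tpr(id_scores, ood_scores):
    """FPR on OOD samples at the threshold retaining ~95% of ID samples.

    Assumes higher score = more ID-like; ID samples are the positives.
    """
    # Threshold at the 5th percentile of ID scores, so ~95% of ID scores pass.
    threshold = sorted(id_scores)[int(0.05 * len(id_scores))]
    # False positives: OOD samples whose score also clears the ID threshold.
    return sum(s >= threshold for s in ood_scores) / len(ood_scores)

# Toy check: with ID scores 1..100 the threshold lands at 6,
# and 2 of the 4 OOD scores (7 and 10) clear it.
print(fpr_at_95_tpr(list(range(1, 101)), [3, 7, 10, 1]))  # → 0.5
```

Lower FPR95 is better, which is why the reported reductions of 5–8 percentage points over the second-best methods are the paper's main quantitative claim.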
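The optimizer settings quoted in the experiment setup (SGD, learning rate 0.1, momentum 0.9, weight decay 0.0005) can be illustrated with a pure-Python momentum-SGD update on a scalar parameter. This is a sketch of the standard update rule under those hyperparameters, not code from the paper:

```python
def sgd_step(param, grad, buf, lr=0.1, momentum=0.9, weight_decay=5e-4):
    """One SGD step with momentum and L2 weight decay (PyTorch-style)."""
    g = grad + weight_decay * param  # gradient plus L2 penalty term
    buf = momentum * buf + g         # momentum buffer accumulates gradients
    return param - lr * buf, buf

# Minimize f(w) = (w - 2)^2 from w = 0; the gradient is 2 * (w - 2).
w, buf = 0.0, 0.0
for _ in range(200):
    w, buf = sgd_step(w, 2.0 * (w - 2.0), buf)
# w converges near the minimum at 2 (weight decay shifts it very slightly below).
```

In the paper's actual training loop these values would be passed to a standard SGD optimizer over ResNet-18 parameters; the scalar example only shows how the three hyperparameters interact in the update.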