Outlier Synthesis via Hamiltonian Monte Carlo for Out-of-Distribution Detection

Authors: Hengzhuang Li, Teng Zhang

ICLR 2025

Reproducibility Variable Result LLM Response
Research Type | Experimental | "By empirically competing with SOTA baselines on both standard and large-scale benchmarks, we verify the efficacy and efficiency of our proposed HamOS. Our code is available at: https://github.com/Fir-lat/HamOS_OOD. ... We conduct extensive empirical analysis to demonstrate the state-of-the-art (SOTA) performance of HamOS."
Researcher Affiliation | Academia | "Hengzhuang Li, Teng Zhang — School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China"
Pseudocode | Yes | "The pseudo-code is provided in Algorithm 1, with the whole training pipeline displayed in Algorithm 2 in Appendix D."
Open Source Code | Yes | "Our code is available at: https://github.com/Fir-lat/HamOS_OOD."
Open Datasets | Yes | "Following the common practice for benchmarking OOD detection (Zhang et al., 2023b), we use CIFAR-10, CIFAR-100 (Krizhevsky & Hinton, 2009) and ImageNet-1K (Deng et al., 2009) as ID datasets, and adopt a series of datasets as OOD testing data. For CIFAR ID datasets, we use MNIST (Deng, 2012), SVHN (Netzer et al., 2011), Textures (Cimpoi et al., 2014), Places365 (Zhou et al., 2017), and LSUN (Yu et al., 2015) as OOD testing data; for ImageNet-1K, we use iNaturalist (Van Horn et al., 2018), Textures (Cimpoi et al., 2014), SUN (Xiao et al., 2010), and Places365 (López-Cifuentes et al., 2020) as OOD testing data."
Dataset Splits | No | The paper mentions an "ID training dataset" and "OOD test datasets" and refers to common practice for benchmarking OOD detection. However, it does not explicitly state the split percentages (e.g., 80/10/10) or sample counts for the training, validation, and test sets of any of the mentioned datasets within its own text.
Hardware Specification | Yes | "All experiments in this paper are conducted for multiple runs on a single NVIDIA Tesla V100 Tensor Core with 32GB memory using Python version 3.10.9."
Software Dependencies | Yes | "Python version 3.10.9. The deep learning environment is established using PyTorch version 1.13.1 and Torchvision version 0.14.1 with CUDA 12.2 on Ubuntu 18.04.6."
Experiment Setup | Yes | "For fine-tuning a pretrained model, we set the training epochs to 20 following previous works (Ming et al., 2023b; Tao et al., 2023), with the mini-batch size set to 128 for CIFAR-10/100 and 256 for ImageNet-1K. For ImageNet-1K, we freeze the first three layers of the pretrained model following Ming et al. (2023b) and use cosine annealing as the learning rate schedule beginning at 0.0001. For memory efficiency, we set the size of the class-conditional ID buffer to 1000 for CIFAR-10/100 and 100 for ImageNet-1K. We summarize the default training configurations of HamOS in Table 4." Table 4 (Training Configurations of HamOS): Training epochs 20; Learning rate 0.01; Momentum 0.9; Batch size 128; Weight decay 1.0×10⁻⁴; LR schedule cosine annealing; Prototype update factor 0.95; Buffer size of ID data 1000; Bandwidth κ 2.0; OOD-discernment weight λd 0.1; k for KNN distance 200; Hard margin δ 0.1; Leapfrog steps L 3; Step size ϵ 0.1; Number of adjacent ID clusters Nadj 4; Synthesis rounds R 5.
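For reproduction attempts, the default hyperparameters reported in Table 4 can be collected into a single configuration object. The sketch below is illustrative only: the key names are hypothetical (they are not taken from the authors' repository), while the values are those stated in the paper.

```python
# Hypothetical configuration sketch of HamOS's Table 4 defaults.
# Key names are invented for illustration; values come from the paper.
hamos_default_config = {
    "training_epochs": 20,
    "learning_rate": 0.01,
    "momentum": 0.9,
    "batch_size": 128,              # paper uses 256 for ImageNet-1K
    "weight_decay": 1.0e-4,
    "lr_schedule": "cosine_annealing",
    "prototype_update_factor": 0.95,
    "id_buffer_size": 1000,         # paper uses 100 for ImageNet-1K
    "bandwidth_kappa": 2.0,
    "ood_discernment_weight": 0.1,  # λd
    "knn_k": 200,
    "hard_margin": 0.1,             # δ
    "leapfrog_steps": 3,            # L
    "step_size": 0.1,               # ϵ
    "num_adjacent_id_clusters": 4,  # Nadj
    "synthesis_rounds": 5,          # R
}
```

Keeping these values in one dictionary makes it straightforward to check a reimplementation against the reported defaults before varying any of them.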