A Closer Look at Generalized BH Algorithm for Out-of-Distribution Detection

Authors: Xinsong Ma, Jie Wu, Weiwei Liu

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Finally, extensive experimental results validate the effectiveness of our theoretical findings and demonstrate the superiority of our method over the g-BH algorithm on small calibrated sets.
Researcher Affiliation | Academia | School of Computer Science, National Engineering Research Center for Multimedia Software, Institute of Artificial Intelligence and Hubei Key Laboratory of Multimedia and Network Communication Engineering, Wuhan University, Wuhan, China. Correspondence to: Weiwei Liu <EMAIL>.
Pseudocode | Yes | Algorithm 1 eg-BH algorithm
Open Source Code | No | We mainly follow the experimental implementation in (Yang et al., 2022; Zhang et al., 2023a), and our codes are based on (Zhang et al., 2023a).
Open Datasets | Yes | We use CIFAR-10 (Krizhevsky et al., 2009) as ID data, and use CIFAR-100, ImageNet (Krizhevsky et al., 2017), SVHN (Netzer et al., 2011), Fashion-MNIST (F-MNIST) (Xiao et al., 2017), Places365 (Zhou et al., 2018) and MNIST (Deng, 2012) as OOD data.
Dataset Splits | Yes | We first split the training data equally into two parts. One part is employed to train the neural networks for constructing the score function, and the other serves as the largest calibrated set $T^{\mathrm{cal}}_M$. Then, from $T^{\mathrm{cal}}_M$, we extract samples at various proportions r to construct several relatively smaller calibrated sets, where r = {0.2, 0.3, ..., 1.0}.
Hardware Specification | No | The paper mentions using ResNet18 and WideResNet as models, but no specific hardware, such as GPU or CPU models, is mentioned for running the experiments.
Software Dependencies | No | We mainly follow the experimental implementation in (Yang et al., 2022; Zhang et al., 2023a), and our codes are based on (Zhang et al., 2023a). However, specific versions of software dependencies such as Python, PyTorch, or CUDA are not listed.
Experiment Setup | Yes | We choose two famous methods, MSP (Hendrycks & Gimpel, 2017) and Energy (Liu et al., 2020), as the score functions in our method. Model. The score functions in this paper are based on ResNet18 and WideResNet, respectively. To assess the impact of L on both the g-BH algorithm and the eg-BH algorithm, we set L = {6, 7, 8, 9, 10} and conduct the corresponding experiments using Energy as the score function based on ResNet18. We first split the training data equally into two parts... from $T^{\mathrm{cal}}_M$, we extract samples at various proportions r to construct several relatively smaller calibrated sets, where r = {0.2, 0.3, ..., 1.0}.
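
The Dataset Splits and Experiment Setup entries describe a procedure that can be sketched in a few lines of Python. The following is a minimal illustration and not the authors' code, which is not released. The function names, the random shuffling, the fixed seeds, and the choice to draw each smaller calibrated set independently (rather than as nested subsets) are all assumptions made here. The index range of 50,000 corresponds to the CIFAR-10 training set. The two score functions follow the commonly used definitions of MSP and of the negative free energy; the temperature parameter is likewise an assumption.

```python
import math
import random


def split_train(indices, seed=0):
    """Shuffle indices and split them into two equal halves.

    Returns (model_training_half, largest_calibrated_set).
    Shuffling with a fixed seed is an assumption of this sketch.
    """
    rng = random.Random(seed)
    shuffled = list(indices)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def calibrated_subsets(cal_max, proportions, seed=0):
    """Draw one subset of cal_max per proportion r, without replacement.

    Independent draws per r are an assumption; the source does not say
    whether the smaller calibrated sets are nested.
    """
    rng = random.Random(seed)
    subsets = {}
    for r in proportions:
        size = round(r * len(cal_max))
        subsets[r] = rng.sample(cal_max, size)
    return subsets


def msp_score(logits):
    """Maximum softmax probability of a single logit vector."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    return max(exps) / sum(exps)


def energy_score(logits, temperature=1.0):
    """Negative free energy: T * logsumexp(logits / T).

    Larger values indicate a more in-distribution-like input.
    """
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    return temperature * (m + math.log(sum(math.exp(z - m) for z in scaled)))


# CIFAR-10 has 50,000 training images, so each half holds 25,000 indices.
train_half, cal_max = split_train(range(50_000))
proportions = [round(0.1 * k, 1) for k in range(2, 11)]  # 0.2, 0.3, ..., 1.0
subsets = calibrated_subsets(cal_max, proportions)
```

With these assumptions, `subsets[1.0]` contains every index of the largest calibrated set, and the smaller sets shrink proportionally with r.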