ExCeL: Combined Extreme and Collective Logit Information for Out-of-Distribution Detection
Authors: Naveen Karunanayake, Suranga Seneviratne, Sanjay Chawla
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments conducted on CIFAR100, ImageNet-200, and ImageNet-1K datasets demonstrate that ExCeL is consistently among the top five performing methods out of twenty-one existing post-hoc baselines when joint performance on near-OOD and far-OOD is considered (i.e., in terms of AUROC and FPR95). |
| Researcher Affiliation | Academia | Naveen Karunanayake and Suranga Seneviratne, School of Computer Science, The University of Sydney; Sanjay Chawla, Qatar Computing Research Institute, HBKU |
| Pseudocode | No | The paper describes the ExCeL score computation in four steps with mathematical equations (Equations 2, 4, 5, 7, and 8) and prose, but does not present a dedicated, clearly labeled 'Pseudocode' or 'Algorithm' block. |
| Open Source Code | No | We intend to include ExCeL in the OpenOOD benchmark so that our results can be reproduced and compared with future work. |
| Open Datasets | Yes | We use CIFAR100 (Krizhevsky et al., 2009), ImageNet-200 (a.k.a., Tiny ImageNet) (Le & Yang, 2015), and ImageNet-1K (Deng et al., 2009) as ID data in our experiments. ... For CIFAR100, CIFAR10 and Tiny ImageNet datasets serve as near-OOD, while MNIST (Deng, 2012), SVHN (Netzer et al., 2011), Textures (Cimpoi et al., 2014), and Places365 (Zhou et al., 2017) are considered as far-OOD. Similarly, for both Tiny ImageNet and ImageNet-1K, SSB-hard (Vaze et al., 2021) and NINCO (Bitterwolf et al., 2023) datasets are used as near-OOD, while iNaturalist (Van Horn et al., 2018), Textures (Cimpoi et al., 2014), and OpenImage-O (Wang et al., 2022) datasets are used as far-OOD. |
| Dataset Splits | Yes | For consistency, we adopt the same train, validation, and test splits used by the OpenOOD benchmark in implementing our method. ... We use the validation set to determine these hyperparameters. ... We fine-tune α using a validation set following the same approach used in Section 4.2.2 to fine-tune a and b. |
| Hardware Specification | No | The paper mentions training models (ResNet-18, ResNet-50) but does not provide specific details about the hardware used for training or inference, such as GPU models, CPU types, or memory specifications. |
| Software Dependencies | No | The paper mentions using 'ResNet-18', 'ResNet-50', 'SGD optimiser', and 'TorchVision (maintainers & contributors, 2016)'. However, it does not provide specific version numbers for software components like Python, PyTorch, or any other libraries used. |
| Experiment Setup | Yes | Each model is trained for 100 epochs using the standard cross-entropy loss. We use the SGD optimiser with a momentum of 0.9, a learning rate of 0.1, and a cosine annealing decay schedule (Loshchilov & Hutter, 2016). Furthermore, we incorporate a weight decay of 0.0005, and employ batch sizes of 128 and 256 for CIFAR100 and ImageNet-200, respectively. ... By performing a grid search on a, b, and α, we discovered that the best hyperparameter combination for both the CIFAR100 and ImageNet-200 datasets is a = 10, b = 5, and α = 0.8. For ImageNet-1K, it was a = 10, b = 5, and α = 0.6. |
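The grid search over a, b, and α described in the setup row can be sketched as below. This is a minimal illustration, not the authors' implementation: `validation_auroc` is a hypothetical stand-in for evaluating the ExCeL OOD score with a given hyperparameter combination on the held-out validation split, and the toy objective used in the usage example is purely illustrative.

```python
from itertools import product

def grid_search(validation_auroc, a_grid, b_grid, alpha_grid):
    """Return the (a, b, alpha) combination maximising validation AUROC.

    `validation_auroc` is assumed to score one hyperparameter
    combination on the validation split (higher is better).
    """
    best_combo, best_score = None, float("-inf")
    for a, b, alpha in product(a_grid, b_grid, alpha_grid):
        score = validation_auroc(a, b, alpha)
        if score > best_score:
            best_combo, best_score = (a, b, alpha), score
    return best_combo, best_score

# Toy objective peaking at the paper's reported CIFAR100/ImageNet-200
# optimum (a=10, b=5, alpha=0.8) -- purely illustrative, not real AUROC.
toy = lambda a, b, alpha: -((a - 10) ** 2 + (b - 5) ** 2 + (alpha - 0.8) ** 2)
combo, _ = grid_search(toy, [5, 10, 15], [1, 5, 10], [0.6, 0.8, 1.0])
print(combo)  # -> (10, 5, 0.8)
```

In practice each evaluation would compute AUROC of the ExCeL score against near-OOD/far-OOD validation data, which is far more expensive than this toy objective.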