GROOD: GRadient-Aware Out-of-Distribution Detection

Authors: Mostafa ElAraby, Sabyasachi Sahoo, Yann Pequignot, Paul Novello, Liam Paull

TMLR 2025 | Venue PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental evaluations demonstrate that gradients computed from the OOD prototype enhance the distinction between ID and OOD data, surpassing established baselines in robustness, particularly on ImageNet-1k. These findings highlight the potential of gradient-based methods and prototype-driven approaches in advancing OOD detection within deep neural networks.
Researcher Affiliation | Collaboration | Mostafa ElAraby (elarabim@mila.quebec), DIRO, Université de Montréal, Mila Quebec AI Institute; Sabyasachi Sahoo, Université Laval, IID, Mila Quebec AI Institute; Yann Pequignot, Institut Intelligence et Données (IID), Université Laval; Paul Novello, DEEL, IRT Saint Exupéry; Liam Paull, DIRO, Université de Montréal, Mila Quebec AI Institute, CIFAR AI Chair
Pseudocode | Yes | A.1 GROOD Algorithm: For a comprehensive overview, the complete GROOD algorithm is outlined in algorithm 1 and algorithm 2.

Algorithm 1 — GROOD initialization: compute prototypes and gradients for the training set
Require: Training set D_in, trained model f
Require: Mixup parameter λ
1: Compute class prototypes P^early and P^pen (eq. 7)
2: Compute OOD prototype p^pen_ood using synthetic data generation (Sec. 5.2, eq. 8)
3: function comp_grad(h, p^pen_{y=1,...,C}, p^pen_ood)
4:     compute ∇H(h) (eq. 5) using p^pen_{y=1,...,C} and p^pen_ood
5:     return ∇H(h)
6: end function
7: for each x ∈ D_in do
8:     ∇(x) = comp_grad(h(x), p^pen_{y=1,...,C}, p^pen_ood)
9: end for
10: return {∇(x)}_{x ∈ D_in}, p^early_{y=1,...,C}, p^pen_{y=1,...,C}, p^pen_ood

Algorithm 2 — OOD score using GROOD
Require: Training dataset D_in, trained model f
Require: {∇(x)}_{x ∈ D_in}, p^pen_{y=1,...,C}, p^pen_ood from GROOD initialization
Require: Function comp_grad (in Alg. 1)
Require: Sample x_new, threshold τ
1: ∇(x_new) = comp_grad(h(x_new), p^pen_{y=1,...,C}, p^pen_ood)
2: Compute OOD score via nearest-neighbor search: S(x_new) = min_{x ∈ D_in} ‖∇(x_new) − ∇(x)‖_2
3: return ID if S(x_new) ≤ τ, else OOD
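The two algorithms above can be sketched in NumPy. This is a simplified, illustrative surrogate, not the paper's exact method: `comp_grad` here differentiates a cross-entropy over softmax-normalized negative prototype distances with respect to the feature vector, standing in for the paper's eq. (5), and the prototype construction (eq. 7) is a per-class feature mean; the synthetic OOD-prototype generation of Sec. 5.2 is not reproduced.

```python
import numpy as np

def class_prototypes(feats, labels, n_classes):
    # Per-class mean of penultimate-layer features (stand-in for eq. 7).
    return np.stack([feats[labels == c].mean(axis=0) for c in range(n_classes)])

def comp_grad(h, protos, p_ood):
    # Illustrative surrogate for eq. (5): gradient, w.r.t. the feature h, of
    # the cross-entropy between the softmax over negative squared distances
    # to the C class prototypes plus the OOD prototype and the OOD target.
    P = np.vstack([protos, p_ood[None, :]])            # (C+1, d) prototype bank
    d2 = ((h[None, :] - P) ** 2).sum(axis=1)           # squared L2 distances
    logits = -d2
    p = np.exp(logits - logits.max())
    p /= p.sum()
    target = np.zeros_like(p)
    target[-1] = 1.0                                   # one-hot on the OOD slot
    dlogits = p - target                               # softmax-CE gradient
    # Chain rule: d(logits_k)/dh = -2 (h - P_k).
    return (dlogits[:, None] * (-2.0 * (h[None, :] - P))).sum(axis=0)

def grood_score(h_new, train_grads, protos, p_ood):
    # Alg. 2, step 2: nearest-neighbor distance in gradient space.
    g = comp_grad(h_new, protos, p_ood)
    return np.linalg.norm(train_grads - g[None, :], axis=1).min()
```

A sample would then be flagged OOD when `grood_score(...)` exceeds the threshold τ, matching step 3 of Algorithm 2.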
Open Source Code | Yes | To allow for reproducibility and facilitate further research, the complete code, including training and evaluation scripts, is available at: https://github.com/mostafaelaraby/Gradient-Aware-OOD-Detection.
Open Datasets | Yes | Our benchmarking strategy follows the OpenOOD framework (Zhang et al., 2023a; Yang et al., 2022), involving four core ID datasets (CIFAR-10, CIFAR-100, ImageNet-200, ImageNet-1k) and examining both near- and far-OOD scenarios. For CIFAR-10/100 (50k train / 10k test images each), near-OOD datasets are CIFAR-100/TinyImageNet, and far-OOD are MNIST, SVHN, Textures, and Places365. For ImageNet-200 (200 classes, 64x64 resolution), near-OOD datasets are SSB-hard/NINCO, and far-OOD are iNaturalist, Textures, and OpenImage-O; ImageNet-1k shares these OOD datasets.
Dataset Splits | Yes | Our benchmarking strategy follows the OpenOOD framework (Zhang et al., 2023a; Yang et al., 2022), involving four core ID datasets (CIFAR-10, CIFAR-100, ImageNet-200, ImageNet-1k) and examining both near- and far-OOD scenarios. For CIFAR-10/100 (50k train / 10k test images each), near-OOD datasets are CIFAR-100/TinyImageNet, and far-OOD are MNIST, SVHN, Textures, and Places365. For ImageNet-200 (200 classes, 64x64 resolution), near-OOD datasets are SSB-hard/NINCO, and far-OOD are iNaturalist, Textures, and OpenImage-O; ImageNet-1k shares these OOD datasets.
Hardware Specification | No | The authors acknowledge the financial support from DEEL's Industrial and Academic Members and the France 2030 program (Grant agreements n° ANR-10-AIRT-01 and n° ANR-23-IACL-0002). We are grateful to the Digital Research Alliance of Canada and NVIDIA for compute infrastructure.
Software Dependencies | No | We address this challenge using the FAISS library (Douze et al., 2024), which provides efficient approximate nearest-neighbor search through inverted lists and quantization. This makes our method practical for large-scale applications while maintaining accuracy.
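The nearest-neighbor query FAISS accelerates is the one in Algorithm 2: the minimum L2 distance from a query gradient to the bank of training gradients. A minimal exact-search baseline in NumPy is below; at ImageNet-1k scale one would replace it with a FAISS inverted-list index (e.g. `faiss.IndexIVFPQ`) that approximates the same query, as the quote describes.

```python
import numpy as np

def exact_nn_distances(queries, bank):
    # Brute-force L2 nearest-neighbor distances from each query vector to a
    # bank of stored vectors, via ||q - b||^2 = ||q||^2 - 2 q.b + ||b||^2.
    d2 = ((queries ** 2).sum(axis=1)[:, None]
          - 2.0 * queries @ bank.T
          + (bank ** 2).sum(axis=1)[None, :])
    # Clamp tiny negative values caused by floating-point cancellation.
    return np.sqrt(np.maximum(d2, 0.0).min(axis=1))

# FAISS equivalent (sketch; requires the faiss package):
#   index = faiss.IndexFlatL2(d)   # or an IVF+PQ index for approximate search
#   index.add(bank.astype("float32"))
#   D, I = index.search(queries.astype("float32"), k=1)
```

The exact version is O(N·d) per query; inverted lists prune most of the bank and product quantization compresses the stored vectors, which is what keeps the method practical at large N.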
Experiment Setup | Yes | For CIFAR-10/100 and ImageNet-200, we train ResNet-18 models for 100 epochs using SGD with momentum 0.9, weight decay 5e-4, cosine learning rate decay (starting from 0.1), and batch sizes of 128 (CIFAR) and 256 (ImageNet-200). For ImageNet-1K, we use pretrained models from torchvision (ResNet-50, ViT, Swin). When fine-tuning is required, we follow the OpenOOD v1.5 protocol (Zhang et al., 2023a) with 30 epochs, learning rate 0.001, and batch size 256.
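The cosine learning-rate decay used in these runs follows the standard cosine-annealing formula; a minimal sketch of the per-epoch schedule is below (in a PyTorch training loop one would use `torch.optim.lr_scheduler.CosineAnnealingLR` instead).

```python
import math

def cosine_lr(epoch, total_epochs=100, lr0=0.1):
    # Cosine decay from the initial rate lr0 down to 0 over training,
    # matching the 100-epoch, lr0=0.1 setup described above.
    return 0.5 * lr0 * (1.0 + math.cos(math.pi * epoch / total_epochs))
```

For the fine-tuning protocol, the same function would be called with `total_epochs=30` and `lr0=0.001`.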