Energy-based Out-of-distribution Detection
Authors: Weitang Liu, Xiaoyun Wang, John Owens, Yixuan Li
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we describe our experimental setup (Section 4.1) and demonstrate the effectiveness of our method on a wide range of OOD evaluation benchmarks. We also conduct an ablation analysis that leads to an improved understanding of our approach (Section 4.2). |
| Researcher Affiliation | Academia | Weitang Liu, Department of Computer Science and Engineering, University of California, San Diego, La Jolla, CA 92093, USA; Xiaoyun Wang, Department of Computer Science, University of California, Davis, Davis, CA 95616, USA; John D. Owens, Department of Electrical and Computer Engineering, University of California, Davis, Davis, CA 95616, USA; Yixuan Li, Department of Computer Sciences, University of Wisconsin-Madison, Madison, WI 53703, USA |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code is publicly available to facilitate reproducible research: https://github.com/wetliu/energy_ood |
| Open Datasets | Yes | In-distribution Datasets We use the SVHN [28], CIFAR-10 [18], and CIFAR-100 [18] datasets as in-distribution data. |
| Dataset Splits | Yes | We use the standard split, and denote the training and test set by D_in^train and D_in^test, respectively. ... We use the validation set as in Hendrycks et al. [14] to determine the hyperparameters: m_in is chosen from {-3, -5, -7}, and m_out is chosen from {-15, -19, -23, -27} to minimize FPR95. |
| Hardware Specification | Yes | The research at UC Davis was supported by an NVIDIA gift and their donation of a DGX Station. |
| Software Dependencies | No | The paper mentions software like 'WideResNet' but does not specify version numbers for any software components or libraries. |
| Experiment Setup | Yes | For energy fine-tuning, the weight λ of L_energy is 0.1. We use the same training setting as in Hendrycks et al. [14], where the number of epochs is 10, the initial learning rate is 0.001 with cosine decay [24], and the batch size is 128 for in-distribution data and 256 for unlabeled OOD training data. We use the validation set as in Hendrycks et al. [14] to determine the hyperparameters: m_in is chosen from {-3, -5, -7}, and m_out is chosen from {-15, -19, -23, -27} to minimize FPR95. |
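The method being reproduced scores inputs with the free energy E(x; f) = -T · log Σ_k exp(f_k(x)/T) over the classifier logits f(x). A minimal NumPy sketch of that score, useful when checking a reimplementation against the released code (function and variable names here are illustrative, not taken from the authors' repository):

```python
import numpy as np

def energy_score(logits, T=1.0):
    """Free energy E(x; f) = -T * log(sum_k exp(f_k(x) / T)).

    Lower (more negative) energy indicates in-distribution;
    OOD inputs tend to receive higher energy.
    """
    z = np.asarray(logits, dtype=np.float64) / T
    m = z.max(axis=-1, keepdims=True)  # subtract the max for numerical stability
    return -T * (m.squeeze(-1) + np.log(np.exp(z - m).sum(axis=-1)))

# A peaked (confident) logit vector yields lower energy than a flat one.
confident = energy_score([10.0, 0.0, 0.0])
uncertain = energy_score([1.0, 1.0, 1.0])
```

For uniform logits over K classes at T = 1, the score reduces to -log K, which gives a quick sanity check on any reimplementation.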