Mining In-distribution Attributes in Outliers for Out-of-distribution Detection
Authors: Yutian Lei, Luping Ji, Pei Liu
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments demonstrate the superiority of our framework to others. MVOL effectively utilizes both auxiliary OOD datasets and even wild datasets with noisy ID data. |
| Researcher Affiliation | Academia | School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, China |
| Pseudocode | No | The paper includes definitions, propositions, assumptions, and theorems, along with mathematical equations, but it does not contain any clearly labeled pseudocode blocks or algorithms. |
| Open Source Code | Yes | Code: https://github.com/UESTC-nnLab/MVOL |
| Open Datasets | Yes | Datasets. (1) Following the common benchmarks in the literature, we use CIFAR-10 and CIFAR-100 as ID datasets, and six diverse OOD test sets. The Tiny Images dataset used for auxiliary outliers in (Hendrycks, Mazeika, and Dietterich 2019; Liu et al. 2020) has been withdrawn due to ethical concerns. Instead, we use 300K Random Images, a cleaned subset of the Tiny Images dataset by (Hendrycks, Mazeika, and Dietterich 2019), as done in (Katz-Samuels et al. 2022). |
| Dataset Splits | Yes | (2) For experiments with wild datasets, we partition CIFAR-10 into two halves: one provides ID training data, and the other provides auxiliary noisy ID data. CIFAR-100 provides auxiliary outliers. The auxiliary ID and OOD data are mixed at varied noise levels α, i.e., 0, 0.05, 0.1, 0.3, 0.5, and the total number of images in the auxiliary dataset is maintained at 50,000. |
| Hardware Specification | No | The paper does not specify any particular hardware used for running the experiments, such as GPU models, CPU models, or memory. |
| Software Dependencies | No | The paper mentions using 'WideResNet' and 'cross entropy loss' but does not specify any software libraries or their version numbers. |
| Experiment Setup | Yes | Training Details. (1) We use the WideResNet with 40 layers and a widening factor of 2 for all experiments. (2) Two experimental settings are involved: (i) Single model: a single neural network is trained with one-hot labels and cross entropy loss. (ii) Ensemble distillation model: a single network is trained with the standard ensemble distillation algorithm in (Hinton, Vinyals, and Dean 2015), where soft labels are generated by ensemble models. |
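The ensemble-distillation setting in the "Experiment Setup" row follows Hinton, Vinyals, and Dean (2015): soft labels are the averaged (temperature-scaled) predictions of the ensemble members, and the student is trained with cross entropy against them. A minimal NumPy sketch of those two pieces (function names, the batch shape, and the temperature value are illustrative, not taken from the paper):

```python
import numpy as np

def log_softmax(z):
    """Numerically stable log-softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=-1, keepdims=True))

def ensemble_soft_labels(member_logits, temperature=1.0):
    """Average the temperature-scaled softmax outputs of the ensemble members."""
    probs = np.exp(log_softmax(np.stack(member_logits) / temperature))
    return probs.mean(axis=0)

def distillation_loss(student_logits, soft_labels):
    """Cross entropy between the ensemble's soft labels and the student's predictions."""
    return float(-(soft_labels * log_softmax(student_logits)).sum(axis=-1).mean())

# Toy example: two ensemble members, a batch of 4 samples, 10 classes (as in CIFAR-10).
rng = np.random.default_rng(0)
member_logits = [rng.normal(size=(4, 10)) for _ in range(2)]
soft = ensemble_soft_labels(member_logits, temperature=2.0)
student_logits = rng.normal(size=(4, 10))
loss = distillation_loss(student_logits, soft)
```

In practice the student would be the WRN-40-2 network described above, with the loss minimized by SGD; this sketch only shows how the soft targets and the distillation objective are formed.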