LoD: Loss-difference OOD Detection by Intentionally Label-Noisifying Unlabeled Wild Data
Authors: Chuanxing Geng, Qifei Li, Xinrui Wang, Dong Liang, Songcan Chen, Pong C. Yuen
IJCAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We also provide theoretical foundation for LoD's viability, and extensive experiments verify its superiority. |
| Researcher Affiliation | Academia | (1) College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics; (2) Department of Computer Science, Hong Kong Baptist University; (3) MIIT Key Laboratory of Pattern Analysis and Machine Intelligence. EMAIL, EMAIL |
| Pseudocode | Yes | Algorithm 1: LoD OOD Detection Framework |
| Open Source Code | Yes | Our LoD (https://github.com/ChuanxingGeng/LoD) framework contains two main modules, i.e., a loss-difference OOD filtering module and an OOD detector learning module. |
| Open Datasets | Yes | For standard benchmarks, we here follow [Du et al., 2024; Katz-Samuels et al., 2022], and choose CIFAR100 as the in-distribution (ID) dataset (Pin). For the out-of-distribution (OOD) test datasets (Pout), we use a diverse collection of natural image datasets including SVHN [Netzer et al., 2011], Textures [Cimpoi et al., 2014], Places [Zhou et al., 2017], LSUN-Crop [Yu et al., 2015] and LSUN-Resize [Yu et al., 2015]. ... We here select CIFAR10, CIFAR+10, CIFAR+50, and TinyImageNet [Vaze et al., 2022] to curate the hard OOD benchmarks, and more details can be found in Appendix B of the supplementary materials. |
| Dataset Splits | Yes | Specifically, the ID dataset is split into two equal halves (25,000 images per half), with one half used to mix with an OOD dataset (e.g., SVHN) to create the unlabeled wild data (Pwild). ... the training set of 6 ID classes is divided into two halves (15,000 images per half). One half is used as labeled ID data, while the other half is mixed with the data from 4 OOD classes to create the unlabeled wild data. |
| Hardware Specification | Yes | All experiments are conducted on a single NVIDIA RTX 3090 GPU. |
| Software Dependencies | No | The paper mentions "WideResNet [Zagoruyko, 2016]" as the backbone and "stochastic gradient descent" as the optimizer, but it does not specify version numbers for any software libraries or frameworks such as PyTorch, TensorFlow, or CUDA. |
| Experiment Setup | Yes | For these two modules, we follow [Du et al., 2024; Katz-Samuels et al., 2022] and employ WideResNet [Zagoruyko, 2016] with 40 layers and a widening factor of 2 as the backbone. Moreover, for the loss-difference OOD filtering module, we use stochastic gradient descent with a momentum of 0.9 as the optimizer, and set the initial learning rate to 0.01. We train for 100 epochs using cosine learning rate decay, a batch size of 128 in which \|B_in^train\| : \|B_wild\| = 3 : 1, and a dropout rate of 0.3. For the OOD detector learning module, similar to [Du et al., 2024], we load a pre-trained ID classifier and add an additional linear layer which utilizes the penultimate-layer features of the ID classifier for binary classification. The initial learning rate is set to 0.001, and the remaining training configurations are consistent with those of the former module. |
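The dataset-split recipe quoted above (halve the ID training set, mix one half with OOD data to form the unlabeled wild set) can be sketched as follows. This is a minimal illustration, not the authors' released code; the function name `make_wild_split` and its arguments are hypothetical, and any sequences of samples can stand in for the image datasets.

```python
import random

def make_wild_split(id_samples, ood_samples, seed=0):
    """Hypothetical sketch of the split described in the report:
    the ID set is divided into two equal halves, one kept as labeled
    ID data and the other mixed with OOD data to form the unlabeled
    wild set P_wild."""
    rng = random.Random(seed)
    id_samples = list(id_samples)
    rng.shuffle(id_samples)
    half = len(id_samples) // 2
    labeled_id = id_samples[:half]                 # e.g. 25,000 CIFAR-100 images
    wild = id_samples[half:] + list(ood_samples)   # ID half + OOD (e.g. SVHN)
    rng.shuffle(wild)                              # wild data carries no labels
    return labeled_id, wild
```

With CIFAR-100 (50,000 training images) this yields 25,000 labeled ID images and a wild set of 25,000 ID images plus however many OOD images are mixed in, matching the split sizes quoted in the table.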
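The training configuration in the Experiment Setup row can be summarized in code. The hyperparameter values are taken from the quoted text; the dict names, the `cosine_lr` helper, and the per-batch arithmetic are illustrative assumptions, not the paper's implementation.

```python
import math

# Hyperparameters quoted from the setup: filtering module vs. detector module
FILTER_CFG = dict(lr=0.01, momentum=0.9, epochs=100, batch_size=128,
                  id_to_wild_ratio=(3, 1), dropout=0.3)
DETECTOR_CFG = dict(lr=0.001, momentum=0.9, epochs=100, batch_size=128)

def cosine_lr(initial_lr, epoch, total_epochs):
    """Cosine learning-rate decay: initial_lr at epoch 0, decaying to 0
    at total_epochs (the standard cosine-annealing schedule)."""
    return 0.5 * initial_lr * (1.0 + math.cos(math.pi * epoch / total_epochs))

# Under the 3:1 ratio, each batch of 128 would hold 96 labeled ID samples
# and 32 wild samples (an assumed reading of |B_in^train| : |B_wild| = 3:1).
id_per_batch = FILTER_CFG["batch_size"] * 3 // 4
wild_per_batch = FILTER_CFG["batch_size"] // 4
```

The detector module reuses the same schedule and batch composition; per the quote, only its initial learning rate (0.001) differs, since it trains just an added linear layer on the frozen classifier's penultimate features.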