A Survey on Out-of-Distribution Detection in NLP

Authors: Hao Lang, Yinhe Zheng, Yixuan Li, Jian Sun, Fei Huang, Yongbin Li

TMLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | In this survey, we provide a comprehensive review of OOD detection methods in NLP. We formalize the OOD detection tasks and identify the major challenges of OOD detection in NLP. A taxonomy of existing OOD detection methods is also provided. We hope this survey helps researchers locate their target problems and find the most suitable datasets, metrics, and baselines. Moreover, we also provide some promising directions that can inspire future research and exploration. Finally, we do not present any new empirical results. It would be helpful to perform comparative experiments over different OOD detection methods (Yang et al., 2022). We leave this as future work.
Researcher Affiliation | Collaboration | Hao Lang (Alibaba Group), Yinhe Zheng (Alibaba Group), Yixuan Li (Department of Computer Sciences, University of Wisconsin-Madison), Jian Sun (Alibaba Group), Fei Huang (Alibaba Group), Yongbin Li (Alibaba Group)
Pseudocode | No | The paper describes methodologies in prose and through a taxonomy diagram, but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | No | The paper is a survey and does not present new empirical results or a novel methodology that would typically involve a dedicated code release. It contains no explicit statement about releasing code and no link to a code repository for the work it describes.
Open Datasets | Yes | CLINC150 (Larson et al., 2019), Banking (Casanueva et al., 2020), Stack Overflow (Xu et al., 2015), STAR (Mosig et al., 2020), and ROSTD (Gangal et al., 2020) are mentioned and cited in Appendix B, with specific references to publicly available datasets that are commonly used in the field. These are standard academic datasets with proper attribution.
Dataset Splits | No | The paper is a survey and does not conduct new experiments requiring specific dataset splits for reproduction. It mentions various datasets and their characteristics but does not provide split information for its own work.
Hardware Specification | No | The paper is a survey and explicitly states, 'Finally, we do not present any new empirical results.' Therefore, no hardware specifications for running experiments are provided.
Software Dependencies | No | The paper is a survey and does not implement a new methodology. It discusses various existing software and models but does not specify software dependencies with version numbers for its own contribution.
Experiment Setup | No | The paper is a survey and explicitly states, 'Finally, we do not present any new empirical results.' Therefore, no experimental setup details such as hyperparameters or training configurations are provided for the work described in this paper.