A Survey on Out-of-Distribution Detection in NLP
Authors: Hao Lang, Yinhe Zheng, Yixuan Li, Jian Sun, Fei Huang, Yongbin Li
TMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | In this survey, we provide a comprehensive review of OOD detection methods in NLP. We formalize the OOD detection tasks and identify the major challenges of OOD detection in NLP. A taxonomy of existing OOD detection methods is also provided. We hope this survey helps researchers locate their target problems and find the most suitable datasets, metrics, and baselines. Moreover, we also provide some promising directions that can inspire future research and exploration. Finally, we do not present any new empirical results. It would be helpful to perform comparative experiments over different OOD detection methods (Yang et al., 2022). We leave this as future work. |
| Researcher Affiliation | Collaboration | Hao Lang (Alibaba Group), Yinhe Zheng (Alibaba Group), Yixuan Li (Department of Computer Sciences, University of Wisconsin-Madison), Jian Sun (Alibaba Group), Fei Huang (Alibaba Group), Yongbin Li (Alibaba Group) |
| Pseudocode | No | The paper describes methodologies in prose and through a taxonomy diagram, but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper is a survey and does not present new empirical results or a novel methodology that would typically involve a dedicated code release. There are no explicit statements about releasing code or links to a code repository for the work described in this paper. |
| Open Datasets | Yes | CLINC150 (Larson et al., 2019), Banking (Casanueva et al., 2020), Stack Overflow (Xu et al., 2015), STAR (Mosig et al., 2020), ROSTD (Gangal et al., 2020) are mentioned and cited in Appendix B, providing specific references to publicly available datasets that are commonly used in the field. These are standard academic datasets with proper attribution. |
| Dataset Splits | No | The paper is a survey and does not conduct new experiments requiring specific dataset splits for reproduction. It mentions various datasets and their characteristics but does not provide split information for its own work. |
| Hardware Specification | No | The paper is a survey and explicitly states, 'Finally, we do not present any new empirical results.' Therefore, no hardware specifications for running experiments are provided. |
| Software Dependencies | No | The paper is a survey and does not implement a new methodology. It discusses various existing software and models but does not specify software dependencies with version numbers for its own contribution. |
| Experiment Setup | No | The paper is a survey and explicitly states, 'Finally, we do not present any new empirical results.' Therefore, no experimental setup details like hyperparameters or training configurations are provided for the work described in this paper. |