A Survey on the Honesty of Large Language Models
Authors: Siheng Li, Cheng Yang, Taiqiang Wu, Chufan Shi, Yuji Zhang, Xinyu Zhu, Zesen Cheng, Deng Cai, Mo Yu, Lemao Liu, Jie Zhou, Yujiu Yang, Ngai Wong, Xixin Wu, Wai Lam
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | To address the aforementioned challenges and promote further research on the honesty of LLMs, we provide an extensive overview of current studies in this area. Figure 1 shows the outline of this survey. We start by summarizing the widely accepted and inclusive definitions of the honesty of LLMs from previous research (§2). Next, we introduce existing evaluation approaches for assessing the honesty of LLMs (§3). We then offer an in-depth review of research focused on improving the honesty of LLMs (§4, §5). Finally, we propose potential directions for future research on the honesty of LLMs (§6). |
| Researcher Affiliation | Collaboration | (1) The Chinese University of Hong Kong; (2) The University of Hong Kong; (3) Tsinghua University; (4) University of Illinois at Urbana-Champaign; (5) University of Virginia; (6) Peking University; (7) WeChat AI |
| Pseudocode | No | The paper is a survey and outlines concepts, definitions, evaluation approaches, and improvement strategies for LLM honesty. It does not present any novel algorithms or procedures in pseudocode blocks or clearly labeled algorithm sections. |
| Open Source Code | No | The paper states: "We will constantly update the related research at https://github.com/SihengLi99/LLM-Honesty-Survey." This repository tracks related research for the survey itself rather than providing source code for a specific experimental methodology developed within this paper. |
| Open Datasets | Yes | Representative benchmarks in this approach include SelfAware (Yin et al., 2023), KUQ (Amayuelas et al., 2023), UnknownBench (Liu et al., 2024a), HoneSet (Gao et al., 2024) and BeHonest (Chern et al., 2024). These benchmarks generally assume that the model's pre-training corpus forms its knowledge base. For example, Yin et al. (2023) consider Wikipedia part of the model's known knowledge, as it is often included in pre-training data. Therefore, questions sourced from Wikipedia, such as those in SQuAD (Rajpurkar et al., 2016), can be treated as known questions. |
| Dataset Splits | No | The paper is a survey of existing research and does not conduct its own experiments or define new datasets or dataset splits for reproduction. It references various existing datasets and benchmarks but does not specify how they should be split for new experiments. |
| Hardware Specification | No | The paper is a survey and does not report on any new experimental results or specify the hardware used for any experiments conducted by the authors. |
| Software Dependencies | No | The paper is a survey and does not describe a specific experimental setup requiring software dependencies with version numbers. |
| Experiment Setup | No | The paper is a survey and does not include details of an experimental setup, such as hyperparameters or training configurations, as it does not present new experimental results. |