Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models

Authors: Yang Sui, Yu-Neng Chuang, Guanchu Wang, Jiamu Zhang, Tianyi Zhang, Jiayi Yuan, Hongyi Liu, Andrew Wen, Shaochen Zhong, Na Zou, Hanjie Chen, Xia Hu

TMLR 2025

Reproducibility assessment. Each entry below gives the variable, the result, and the LLM's response.
Research Type: Theoretical. "In this paper, we provide the first structured survey to systematically investigate and explore the current progress toward achieving efficient reasoning in LLMs. Overall, relying on the inherent mechanism of LLMs, we categorize existing works into several key directions: (1) model-based efficient reasoning, which considers optimizing full-length reasoning models into more concise reasoning models or directly training efficient reasoning models; (2) reasoning output-based efficient reasoning, which aims to dynamically reduce reasoning steps and length during inference; (3) input prompts-based efficient reasoning, which seeks to enhance reasoning efficiency based on input prompt properties such as difficulty or length control. Additionally, we introduce the use of efficient data for training reasoning models, explore the reasoning capabilities of small language models, and discuss evaluation methods and benchmarking."
Researcher Affiliation: Academia. Yang Sui, Yu-Neng Chuang, Guanchu Wang, Jiamu Zhang, Tianyi Zhang, Jiayi Yuan, Hongyi Liu, Andrew Wen, Shaochen Zhong, Hanjie Chen, and Xia Hu (Rice University); Na Zou (University of Houston).
Pseudocode: No. The paper describes various approaches to efficient reasoning in LLMs, such as RL optimization via length rewards, SFT with variable-length CoT, latent representation compression, and dynamic reasoning paradigms. These methods are explained conceptually and through comparative tables of the existing literature, but no structured pseudocode or algorithm blocks are provided.
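Since the paper itself provides no pseudocode, the following is a minimal illustrative sketch of the kind of length-penalized reward used in "RL optimization via length reward" approaches: correctness earns a fixed reward, and reasoning tokens beyond a budget are penalized. The function name, budget, and penalty coefficient are assumptions for illustration, not taken from any surveyed method.

```python
def length_penalized_reward(correct: bool, n_tokens: int,
                            budget: int = 512, alpha: float = 0.5) -> float:
    """Toy length-reward shaping for RL-trained reasoning models.

    correct:  whether the model's final answer is correct
    n_tokens: number of tokens in the generated reasoning chain
    budget:   token budget before any length penalty applies (assumed)
    alpha:    penalty weight per budget-multiple of overflow (assumed)
    """
    accuracy_reward = 1.0 if correct else 0.0
    overflow = max(0, n_tokens - budget)        # tokens past the budget
    length_penalty = alpha * overflow / budget  # grows linearly with overflow
    return accuracy_reward - length_penalty
```

Under this sketch, a correct answer within budget scores 1.0, while a correct answer using twice the budget scores only 0.5, so the policy is pushed toward shorter chains that stay correct.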
Open Source Code: No. The authors maintain a public repository to continuously track the latest research in this area (https://github.com/Eclipsess/Awesome-Efficient-Reasoning-LLMs), but it collects pointers to existing work rather than source code for the survey methodology itself.
Open Datasets: Yes. The comparison tables (e.g., Table 1, which compares length reward-based RL methods) reference publicly available benchmarks, including GSM8K, Gao Kao, MATH-500, AIME-2024, Theorem QA, MMLU-Pro-1k, AMC, GPQA, LAST, MMLU, and Olympiad-Bench; one listed method is trained with constructed length preference data.
Dataset Splits: No. As a survey of efficient reasoning for LLMs, the paper describes datasets used by the works it reviews (e.g., GSM8K, MATH) but presents no experiments of its own and therefore specifies no dataset splits.
Hardware Specification: No. The paper surveys existing research and reports no experiments conducted by its authors, so no hardware specifications are provided.
Software Dependencies: No. The survey introduces no new experimental methodology, so no software dependencies or version numbers are listed.
Experiment Setup: No. The paper summarizes and categorizes findings from other papers; it conducts no experiments of its own and provides no hyperparameters or training configurations for a methodology developed in this paper.