Forest-of-Thought: Scaling Test-Time Compute for Enhancing LLM Reasoning

Authors: Zhenni Bi, Kai Han, Chuanjian Liu, Yehui Tang, Yunhe Wang

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results demonstrate that the FoT framework, combined with these strategies, significantly enhances the reasoning capabilities of LLMs, enabling them to solve complex tasks with greater precision and efficiency. We evaluate the proposed FoT method on the widely-used LLM reasoning benchmarks including Game of 24, GSM8K and MATH.
Researcher Affiliation | Industry | 1Huawei Noah's Ark Lab. Correspondence to: Yehui Tang <EMAIL>, Yunhe Wang <EMAIL>.
Pseudocode | Yes | Algorithm 1 Forest of Tree (FoT) Require: Input x, LLM pθ, n reasoning trees {T_i(·)}, i = 1, 2, ..., n;
Open Source Code | No | Code will be available at https://github.com/iamhankai/Forest-of-Thought.
Open Datasets | Yes | We evaluate the proposed FoT method on the widely-used LLM reasoning benchmarks including Game of 24, GSM8K and MATH. For the Game of 24 (Yao et al., 2024), our FoT is built using ToT as the reasoning tree. In addition to the ToT-based FoT, we developed an MCTSr-based FoT to address mathematical problems, including those from the GSM8K (Cobbe et al., 2021a) and MATH (Hendrycks et al., 2021b) benchmarks.
Dataset Splits | Yes | We removed the duplicate and unsolvable problems, leaving 95 problems as the test set.
Hardware Specification | Yes | We also gratefully acknowledge the support provided by MindSpore, CANN (Compute Architecture for Neural Networks), and the Ascend AI Processor used in this research.
Software Dependencies | No | We also gratefully acknowledge the support provided by MindSpore, CANN (Compute Architecture for Neural Networks), and the Ascend AI Processor used in this research.
Experiment Setup | Yes | In our experiment, we set the sampling temperature to 0.95.