ResearchTown: Simulator of Human Research Community

Authors: Haofei Yu, Zhaochen Hong, Zirui Cheng, Kunlun Zhu, Keyang Xuan, Jinwei Yao, Tao Feng, Jiaxuan You

ICML 2025

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments reveal three key findings: (1) RESEARCHTOWN can provide a realistic simulation of collaborative research activities, including paper writing and review writing; (2) RESEARCHTOWN can maintain robust simulation with multiple researchers and diverse papers; (3) RESEARCHTOWN can generate interdisciplinary research ideas that potentially inspire pioneering research directions. |
| Researcher Affiliation | Academia | University of Illinois Urbana-Champaign. Correspondence to: Haofei Yu <EMAIL>, Jiaxuan You <EMAIL>. |
| Pseudocode | Yes | Algorithm 1: RESEARCHTOWN simulation algorithm |
| Open Source Code | Yes | Code: https://github.com/ulab-uiuc/research-town |
| Open Datasets | Yes | Data: https://huggingface.co/datasets/ulab-ai/research-bench |
| Dataset Splits | Yes | To allow more fine-grained analysis, we split these 1,000 paper-writing tasks into three subgroups based on their difficulty level. We use the data-agg settings described in Section 7 to obtain results and compute similarity scores for our simulations. We then divide the dataset into three equal subsets: the worst 333 data points (hard), the middle 334 data points (medium), and the top 333 data points (easy). This results in a more granular categorization of the dataset's difficulty. |
| Hardware Specification | No | The paper mentions using LLMs such as GPT-4o-mini, Qwen-2.5-7B-Instruct, and Deepseek-v3, and embedding models such as text-embedding-3-large and voyage-3, often via APIs (e.g., "accessed via the OpenAI API", "via the together.ai inference API"). However, it does not specify the underlying hardware (e.g., CPU or GPU models, memory) on which these experiments were run or on which the APIs operate. |
| Software Dependencies | Yes | We utilize GPT-4o-mini-2024-07-18 accessed via the OpenAI API. We use Qwen-2.5-7B-Instruct-Turbo and Deepseek-v3-0324 via the together.ai inference API. We utilize the official inference APIs provided by OpenAI and Voyage AI to use text-embedding-3-large and voyage-3, respectively. |
| Experiment Setup | Yes | We utilize GPT-4o-mini as the LLM backbone for implementing the agent functions, with the decoding temperature set to 0 to ensure reproducibility. |
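The difficulty split described under Dataset Splits can be sketched as follows: rank the 1,000 paper-writing tasks by their simulation similarity score, then cut the ranking into 333 / 334 / 333 subsets. The task list and scores below are hypothetical stand-ins for the paper's actual data.

```python
# Sketch of the difficulty-based split: rank tasks by similarity score,
# then cut into hard (worst 333), medium (middle 334), and easy (top 333).
# `tasks` and `scores` are hypothetical stand-ins for the real data.

def split_by_difficulty(tasks, scores):
    """Return (hard, medium, easy) subsets of sizes 333 / 334 / 333."""
    ranked = [t for _, t in sorted(zip(scores, tasks))]  # ascending score
    hard = ranked[:333]        # worst-scoring 333 data points
    medium = ranked[333:667]   # middle 334 data points
    easy = ranked[667:]        # top-scoring 333 data points
    return hard, medium, easy

tasks = [f"task_{i}" for i in range(1000)]
scores = [(i * 37) % 1000 for i in range(1000)]  # dummy similarity scores
hard, medium, easy = split_by_difficulty(tasks, scores)
print(len(hard), len(medium), len(easy))  # 333 334 333
```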
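The Experiment Setup row amounts to pinning a deterministic decoding configuration. A minimal sketch, assuming the standard `openai` Python client; the prompt and the `build_request` helper are hypothetical, since the report does not show the agent-function prompts:

```python
# Sketch of a deterministic agent-function call, assuming the standard
# `openai` Python client; the prompt text is a hypothetical placeholder.

def build_request(prompt: str) -> dict:
    """Assemble the chat-completion parameters shared by the agent functions."""
    return {
        "model": "gpt-4o-mini-2024-07-18",  # pinned snapshot named in the report
        "temperature": 0,                    # greedy decoding for reproducibility
        "messages": [{"role": "user", "content": prompt}],
    }

# With a configured client this would run as:
#   from openai import OpenAI
#   client = OpenAI()
#   resp = client.chat.completions.create(**build_request("Draft a review ..."))

params = build_request("Draft a review for the given paper.")
print(params["model"], params["temperature"])
```

Pinning both the model snapshot and `temperature=0` is what makes reruns of the simulation comparable; an unpinned model alias could silently change behavior between runs.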