POLO: An LLM-Powered Project-Level Code Performance Optimization Framework

Authors: Jiameng Bai, Ruoyi Xu, Sai Wu, Dingyu Yang, Junbo Zhao, Gang Chen

IJCAI 2025

Reproducibility assessment: Variable | Result | Supporting evidence from the paper
Research Type: Experimental. Evidence: "We conduct experiments on open-source and proprietary projects. The results demonstrate that POLO accurately identifies performance bottlenecks and successfully applies optimizations. Under the O3 compilation flag, the optimized programs achieved speedups ranging from 1.34x to 21.5x."
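For reference, the reported speedup figures follow the standard ratio of baseline runtime to optimized runtime measured under the same compilation flag. A minimal sketch (the timings below are hypothetical, for illustration only, not the paper's measurements):

```python
def speedup(baseline_s: float, optimized_s: float) -> float:
    """Speedup = baseline runtime / optimized runtime, both measured
    under the same compiler flag (e.g. -O3)."""
    return baseline_s / optimized_s

# Hypothetical timings: a program that took 10.75s before optimization
# and 0.5s after corresponds to a 21.5x speedup.
print(round(speedup(10.75, 0.5), 2))  # 21.5
```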
Researcher Affiliation: Academia. Evidence: "1. College of Computer Science and Technology, Zhejiang University; 2. Hangzhou High-Tech Zone (Binjiang) Institute of Blockchain and Data Security; 3. The State Key Laboratory of Blockchain and Data Security, Zhejiang University"
Pseudocode: Yes. Evidence: "The analysis process is detailed in Algorithm 1 in Appendix A."
Open Source Code: No. The paper provides no explicit statement about releasing POLO's source code and no link to a repository.
Open Datasets: Yes. Evidence (benchmark projects as listed in the paper):
- Quant [Smidt, 2012] | C++ | 5, 1, 9, 315 | Quantitative Trade Framework
- AStar (Private) | C++ | 10, 9, 18, 484 | A* Search Algorithm
- (Medium) Skip List (Private) | C++ | 6, 7, 44, 886 | Skip List Implementation
- AES [Conte, 2015] | C | 3, 0, 31, 1409 | Crypto Algorithm
- (Hard) KGraph [DBAIWang Group, 2021] | C++ | 8, 22, 109, 3937 | Nearest Neighbor Search
- Mini SQL (Private) | C++ | 129, 130, 772, 15915 | Small Database System
Additional dataset mentions: "KGraph: Audio [Group, 2006], Sift1M [Jegou et al., 2010]; Mini SQL: provided, 83.73%; AES [Bell et al., 1990]: 93.69%"
Dataset Splits: No. No explicit train/test/validation splits are provided; the paper optimizes existing C/C++ projects and evaluates their execution time rather than training a model on a dataset with defined splits.
Hardware Specification: No. The paper does not state the hardware used for the experiments, such as CPU/GPU models, memory, or the computing environment.
Software Dependencies: No. Evidence: "We use the Callgrind profiling tool [Weidendorfer, 2012] for dynamic analysis. ... We use Clang's LibTooling [Team, 2007] to construct an Abstract Syntax Tree (AST) for each source file. ... To balance effectiveness and cost, we use GPT-4o [OpenAI, 2024] as the default LLM agent." These mentions lack version numbers for Callgrind and Clang's LibTooling, and GPT-4o is an LLM model rather than a versioned software library or tool in the traditional sense.
Experiment Setup: Yes. Evidence: "To balance effectiveness and cost, we use GPT-4o [OpenAI, 2024] as the default LLM agent. We set the temperature to 0.2 to ensure the results are as deterministic and reproducible as possible. ... For each code optimization, we adopt a Top-N selection criterion, i.e., generating N results and selecting the best one among them. In our experiments, N=5. ... For all projects, we conduct tests using both O0 and O3 flags. ... Each project is executed five times, and the average execution time is reported."
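The setup above (N=5 candidates per optimization, best-of-N chosen by measured runtime, each measurement averaged over five executions) can be sketched as follows. This is a minimal illustration, not the paper's implementation; `generate_candidate` and `run_once` are hypothetical stand-ins for the LLM call and the benchmarked program.

```python
import statistics
from typing import Callable, List, Tuple

def avg_runtime(run_once: Callable[[], float], repeats: int = 5) -> float:
    # Execute the benchmarked program `repeats` times (5 in the paper)
    # and report the mean measured time.
    return statistics.mean(run_once() for _ in range(repeats))

def top_n_select(
    generate_candidate: Callable[[], str],
    benchmark: Callable[[str], float],
    n: int = 5,
) -> Tuple[str, float]:
    # Top-N selection: generate N optimized variants (N=5 in the paper)
    # and keep the one with the lowest benchmarked runtime.
    candidates: List[str] = [generate_candidate() for _ in range(n)]
    timed = [(c, benchmark(c)) for c in candidates]
    return min(timed, key=lambda pair: pair[1])
```

With deterministic stubs in place of the LLM and the real benchmark, `top_n_select` simply returns the fastest of the five generated variants.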