Agent-Oriented Planning in Multi-Agent Systems

Authors: Ao LI, Yuexiang Xie, Songze Li, Fugee Tsung, Bolin Ding, Yaliang Li

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Extensive experiments demonstrate the advancement of AOP in solving real-world problems compared to both single-agent systems and existing planning strategies for multi-agent systems. Extensive experiments are conducted based on several reasoning datasets that require collaboration among multiple LLM-empowered agents. Comparisons between AOP and baseline methods demonstrate the remarkable advancements achieved by the proposed framework. Furthermore, we conduct an ablation study to show the contributions of different components in AOP."
Researcher Affiliation | Collaboration | Ao Li (1,2), Yuexiang Xie (3), Songze Li (4,5), Fugee Tsung (1,2), Bolin Ding (3), Yaliang Li (3); affiliations: 1 The Hong Kong University of Science and Technology (Guangzhou); 2 The Hong Kong University of Science and Technology; 3 Alibaba Group; 4 Southeast University; 5 Engineering Research Center of Blockchain Application, Supervision and Management (Southeast University), Ministry of Education
Pseudocode | No | The paper describes the AOP framework in Section 4, detailing its components and processes in prose. It includes figures such as "Overall architecture of AOP" (Figure 2) and prompt examples in the appendix, but no explicit pseudocode or algorithm blocks appear in the main body or appendices.
Open Source Code | Yes | "The source code is available at https://github.com/lalaliat/Agent-Oriented-Planning."
Open Datasets | Yes | "We conduct experiments based on a numerical reasoning dataset (Kim et al., 2024), which necessitates the collaboration of multiple agents in resolving the queries. Following a previous study (Kim et al., 2024), we adopt Husky QA, which consists of 1,440 queries in the training data and 292 queries in the test data. Besides, we also provide more experimental results on the decontextualized versions of a subset of DROP (Dua et al., 2019) and IIRC (Ferguson et al., 2020) in Appendix D.1."
Dataset Splits | Yes | "Following a previous study (Kim et al., 2024), we adopt Husky QA, which consists of 1,440 queries in the training data and 292 queries in the test data."
Hardware Specification | Yes | "We train the reward model for 50 epochs on one Tesla V100-SXM2-32GB GPU."
Software Dependencies | No | The paper mentions using all-MiniLM-L6-v2 as the embedding layer for the reward model and GPT-4o as the LLM for agents, along with Python for code generation, but it does not specify version numbers for these components or for any other libraries used.
Experiment Setup | Yes | "The batch size is set to 32, and the learning rate is 1e-3. We train the reward model for 50 epochs on one Tesla V100-SXM2-32GB GPU."
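To make the reported hyperparameters concrete, the following is a minimal sketch of a reward-model training loop under the stated settings (batch size 32, learning rate 1e-3, 50 epochs). The paper does not disclose the reward model's architecture, so a simple logistic-regression head over fixed embeddings is a stand-in assumption here; the 384-dimensional input matches the output size of all-MiniLM-L6-v2, and the synthetic data and label rule are purely illustrative.

```python
import numpy as np

# Hyperparameters as reported in the paper's experiment setup.
EMBED_DIM = 384      # output dimension of all-MiniLM-L6-v2 embeddings
BATCH_SIZE = 32
LEARNING_RATE = 1e-3
EPOCHS = 50

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_reward_head(X, y, epochs=EPOCHS, lr=LEARNING_RATE, batch=BATCH_SIZE):
    """Fit a linear reward head (w, b) on embeddings X with binary labels y
    via mini-batch gradient descent on the binary cross-entropy loss."""
    rng = np.random.default_rng(0)
    w = np.zeros(X.shape[1])
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        order = rng.permutation(n)
        for start in range(0, n, batch):
            idx = order[start:start + batch]
            pred = sigmoid(X[idx] @ w + b)
            grad = pred - y[idx]              # dL/dlogit for cross-entropy
            w -= lr * X[idx].T @ grad / len(idx)
            b -= lr * grad.mean()
    return w, b

# Synthetic stand-in data in place of real plan embeddings and reward labels.
rng = np.random.default_rng(42)
X = rng.normal(size=(256, EMBED_DIM))
y = (X[:, 0] > 0).astype(float)               # toy labeling rule
w, b = train_reward_head(X, y)
acc = ((sigmoid(X @ w + b) > 0.5) == y).mean()
print(f"train accuracy: {acc:.2f}")
```

This is only a shape-level illustration of the reported configuration, not the authors' implementation; the actual reward model, its loss, and its training data are described in the paper and released code.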