CaPo: Cooperative Plan Optimization for Efficient Embodied Multi-Agent Cooperation

Authors: Jie Liu, Pan Zhou, Yingjun Du, Ah-Hwee Tan, Cees G. M. Snoek, Jan-Jakob Sonke, Efstratios Gavves

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results on the ThreeDWorld Multi-Agent Transport and Communicative Watch-And-Help tasks demonstrate CaPo's much higher task completion rate and efficiency compared with state-of-the-art methods.
Researcher Affiliation | Academia | 1 University of Amsterdam, The Netherlands; 2 Singapore Management University, Singapore; 3 The Netherlands Cancer Institute, The Netherlands; 4 Archimedes/Athena RC, Greece
Pseudocode | No | The paper does not contain explicit pseudocode or algorithm blocks. It describes processes and prompt templates for LLMs, but not structured algorithms for the core methodology.
Open Source Code | Yes | The code is released at https://github.com/jliu4ai/CaPo.
Open Datasets | Yes | We follow CoELA and adopt the ThreeDWorld Multi-Agent Transport (TDW-MAT) task (Zhang et al., 2023b) and the Communicative Watch-And-Help (C-WAH) task (Zhang et al., 2023b) to test our CaPo.
Dataset Splits | No | The test set of TDW-MAT consists of 24 episodes, evenly divided into food and stuff tasks. In C-WAH, ... the test set contains 10 episodes, including both symbolic and visual observation settings. The paper reports the test-set sizes but does not specify training/validation splits or overall dataset partitioning.
Hardware Specification | No | The paper mentions using specific LLMs (GPT-3.5-turbo, GPT-4, LLAMA-2-13B-CHAT) but does not provide hardware details such as GPU/CPU models or memory used for running the experiments or the embodied agents themselves.
Software Dependencies | No | The paper mentions using GPT-3.5-turbo and GPT-4 from the OpenAI API (OpenAI, 2024), LLAMA-2-13B-CHAT (Touvron et al., 2023), and Mask R-CNN (He et al., 2017) for perception, but does not provide version numbers for underlying software dependencies such as the programming language (e.g., Python), frameworks (e.g., PyTorch/TensorFlow), or other libraries.
Experiment Setup | Yes | We set default parameters for LLMs: temperature of 0.7, a maximum of 256 output tokens, and top-1 sampling.
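The reported decoding parameters can be captured in a small configuration sketch. This is an illustrative reconstruction, not code from the paper: the request shape follows the OpenAI chat-completions API convention, the helper name `build_request` is hypothetical, and "top-1 sampling" is read here as nucleus sampling with `top_p = 1`.

```python
# Decoding defaults reported in the paper: temperature 0.7,
# a maximum of 256 output tokens, and top-1 sampling.
LLM_PARAMS = {
    "temperature": 0.7,
    "max_tokens": 256,
    "top_p": 1,  # assumption: "top-1 sampling" interpreted as top_p = 1
}

def build_request(model, messages, params=LLM_PARAMS):
    """Assemble a chat-completion request body using the paper's defaults."""
    return {"model": model, "messages": messages, **params}
```

A body built this way could then be passed to an OpenAI-style client, e.g. `client.chat.completions.create(**build_request("gpt-4", messages))`, with the same parameters reused unchanged across GPT-3.5-turbo, GPT-4, and a local LLAMA-2-13B-CHAT endpoint.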