Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Simulating Human-like Daily Activities with Desire-driven Autonomy
Authors: Yiding Wang, Yuxuan Chen, Fangwei Zhong, Long Ma, Yizhou Wang
ICLR 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct experiments on Concordia, a text-based simulator, to demonstrate that our agent generates coherent, contextually relevant daily activities while exhibiting variability and adaptability similar to human behavior. A comparative analysis with other LLM-based agents demonstrates that our approach significantly enhances the rationality of the simulated activities. |
| Researcher Affiliation | Academia | Institute for Artificial Intelligence, Peking University; The University of Hong Kong; School of Artificial Intelligence, Beijing Normal University; Academy for Advanced Interdisciplinary Studies, Peking University; State Key Laboratory of General Artificial Intelligence, BIGAI; Center on Frontiers of Computing Studies, School of Computer Science, Nat'l Eng. Research Center of Visual Technology, Peking University |
| Pseudocode | No | The Desire-driven Autonomous Agent (D2A) performs four key procedures using the Value System and the Desire-driven Planner to generate the activity a_t: Qualitative Value Description, Activity Proposal, Activity Evaluation, and Activity Selection. These procedures are described in text, but no formal pseudocode block or algorithm is provided. |
| Open Source Code | Yes | Project page: https://sites.google.com/view/desire-driven-autonomy |
| Open Datasets | No | We conduct experiments on Concordia, a text-based simulator... The environments are based on Concordia (Vezhnevets et al., 2023). The paper focuses on simulating activities within a self-developed environment and does not provide concrete access information for a publicly available or open dataset in the traditional sense. |
| Dataset Splits | No | We conducted 15 trials for each agent under identical initialization settings across all experiments. The paper describes simulation trials and environment setup rather than using predefined training/test/validation splits from a static dataset. |
| Hardware Specification | No | For all four agent-based methods, we used LLaMA3.1-70B as the default backbone model for both the agents and the environment controller. While specific software (LLMs) is mentioned, the paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used to run the experiments. |
| Software Dependencies | Yes | For all four agent-based methods, we used LLaMA3.1-70B as the default backbone model for both the agents and the environment controller. To evaluate the adaptability of our framework across different large language models, we tested it with Qwen 2.5:72B (Yang et al., 2024)... |
| Experiment Setup | Yes | Specifically, for the indoor, single-agent environment, we define 11 dimensions of desired value... To quantitatively track changes in the level of these desires and qualitatively translate numerical values into descriptive states, we apply a [0-10] Likert scale for each dimension... The initial value v_d^0 for each desire dimension is randomly selected from the range [0, 10]... In the Activity Proposal procedure... we prompt the agent to generate N candidate activities (N, also referred to as planner width, is set to 3 in our default experimental configuration)... |
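The setup described in the last row can be sketched as follows. This is a minimal illustration, not the authors' implementation: the dimension count, Likert-scale initialization, and planner width come from the paper, while the function names and the placeholder scoring function are assumptions for demonstration.

```python
import random

NUM_DIMENSIONS = 11   # 11 desire dimensions (indoor, single-agent environment)
PLANNER_WIDTH = 3     # N candidate activities per step (paper's default)

def init_desire_values(num_dims: int = NUM_DIMENSIONS) -> list[int]:
    # Each dimension's initial value v_d^0 is drawn uniformly from the
    # [0, 10] Likert scale, as described in the experiment setup.
    return [random.randint(0, 10) for _ in range(num_dims)]

def select_activity(candidates: list[str], score_fn) -> str:
    # Activity Selection (illustrative): pick the candidate that the
    # evaluation step scores highest. score_fn is a placeholder for the
    # LLM-based Activity Evaluation procedure.
    return max(candidates, key=score_fn)

values = init_desire_values()
candidates = ["read a book", "cook dinner", "take a walk"]  # hypothetical proposals
chosen = select_activity(candidates, len)  # toy score: longest description
```

The sketch only mirrors the numeric configuration (11 dimensions, [0, 10] scale, planner width 3); the actual proposal and evaluation steps are LLM prompts in the paper.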