Intelligent Go-Explore: Standing on the Shoulders of Giant Foundation Models
Authors: Cong Lu, Shengran Hu, Jeff Clune
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our algorithm on a diverse range of language and vision-based tasks that require search and exploration. Across these tasks, IGE strongly exceeds classic reinforcement learning and graph search baselines, and also succeeds where prior state-of-the-art FM agents like Reflexion completely fail. Overall, INTELLIGENT GO-EXPLORE combines the tremendous strengths of FMs and the powerful Go-Explore algorithm, opening up a new frontier of research into creating more generally capable agents with impressive exploration capabilities. All our code is open-sourced at: https://github.com/conglu1997/intelligent-go-explore. |
| Researcher Affiliation | Academia | Cong Lu1,2 EMAIL Shengran Hu1,2 EMAIL Jeff Clune1,2,3 EMAIL 1University of British Columbia 2Vector Institute 3Canada CIFAR AI Chair |
| Pseudocode | Yes | We illustrate our resultant algorithm at the top of Figure 1 and provide full pseudocode in Algorithm 1. |
| Open Source Code | Yes | All our code is open-sourced at: https://github.com/conglu1997/intelligent-go-explore. |
| Open Datasets | Yes | We first demonstrate the effectiveness of IGE in a mathematical reasoning task, Game of 24 (Yao et al., 2023a). The goal is to perform basic arithmetic operations (+, -, *, /) starting from 4 numbers to obtain 24. ... Next, we show that IGE readily operates across multiple modalities in the BabyAI domains from Carta et al. (2023). ... Finally, we show IGE's ability to tackle tasks requiring long-horizon memory and planning, exploration, and commonsense in TextWorld (Côté et al., 2018), a classic text-based agent benchmark. |
| Dataset Splits | Yes | We evaluate IGE across 100 hard test problems in Figure 2 |
| Hardware Specification | No | We used GPT-4-Turbo for Game of 24 and GPT-4o for Baby AI and Text World. This was purely done to select the version of GPT-4 that was available and the cheapest at the time of running the experiments. The version of GPT-4 is consistent per environment. |
| Software Dependencies | No | The paper mentions using specific versions of large language models like GPT-4-Turbo and GPT-4o, but does not specify other ancillary software dependencies like programming languages (e.g., Python), libraries (e.g., PyTorch, TensorFlow), or their corresponding version numbers. |
| Experiment Setup | Yes | Full hyperparameters are detailed in Appendix E. We list the hyperparameters for IGE in Table 6. We list the sampling parameters for GPT-4 (OpenAI, 2024) passed via the OpenAI API in Table 7. |
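The Game of 24 task cited above (combine 4 numbers with +, -, *, / to reach 24) can be checked mechanically. The sketch below is not from the paper's codebase; it is a minimal illustrative brute-force solver, and for simplicity it only applies operators left to right rather than searching all parenthesizations, so it is a simplified variant of the full task.

```python
from itertools import permutations, product

def solve_24(nums, target=24, eps=1e-6):
    """Brute-force left-to-right search: try every ordering of the
    numbers and every choice of operators between them, returning one
    expression that evaluates to the target, or None if none is found."""
    ops = {
        '+': lambda a, b: a + b,
        '-': lambda a, b: a - b,
        '*': lambda a, b: a * b,
        '/': lambda a, b: a / b if abs(b) > eps else None,  # guard /0
    }
    for perm in permutations(nums):
        for combo in product(ops, repeat=len(nums) - 1):
            value, expr = perm[0], str(perm[0])
            for op, n in zip(combo, perm[1:]):
                value = ops[op](value, n)
                if value is None:  # division by zero: abandon branch
                    break
                expr = f"({expr} {op} {n})"
            if value is not None and abs(value - target) < eps:
                return expr
    return None

print(solve_24([1, 2, 3, 4]))  # e.g. (((1 + 2) + 3) * 4)
```

A full Game of 24 solver would also enumerate the remaining parenthesization shapes (e.g. (a op b) op (c op d)); the left-to-right restriction keeps the sketch short while still finding solutions for many instances.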