GRAML: Goal Recognition As Metric Learning
Authors: Matan Shamir, Reuth Mirsky
IJCAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Evaluated on a versatile set of environments, GRAML shows speed, flexibility, and runtime improvements over state-of-the-art GR approaches while maintaining accurate recognition. |
| Researcher Affiliation | Academia | Matan Shamir (1), Reuth Mirsky (1,2); (1) Computer Science Department, Bar-Ilan University, Israel; (2) Computer Science Department, Tufts University, MA, USA |
| Pseudocode | No | The paper describes steps in regular paragraph text without structured formatting, and no figures are labeled as pseudocode or algorithm. |
| Open Source Code | Yes | https://github.com/MatanShamir1/Grlib |
| Open Datasets | Yes | Building on the GCRL survey and the benchmark environments suggested at Apex RL, we form a collection of GR problems from several sets of environments that adhere to the Gymnasium API, with detailed descriptions of each in Appendix ??. We consider two custom Minigrid environments from the minigrid package [Chevalier-Boisvert et al., 2023], two custom Point Maze environments from the Gymnasium-Robotics package [Fu et al., 2020], the Parking environment from the highway-env package [Leurent, 2018], and the Reach environment from Panda Gym [Gallouédec et al., 2021]. |
| Dataset Splits | No | The paper mentions varying observation sequence lengths (30%, 50%, 70%, 100%) and generating '200 GR problems per scenario' but does not specify training, validation, or test dataset splits in a way that allows reproduction of data partitioning. |
| Hardware Specification | Yes | All experiments were conducted on a commodity Intel i7 processor. |
| Software Dependencies | No | The paper mentions software like Python, PyTorch, Stable Baselines3, Gymnasium API, minigrid package, Gymnasium-Robotics package, highway-env package, and Panda Gym, but does not provide specific version numbers for these components. |
| Experiment Setup | Yes | Each single-goal agent was trained for 300,000 timesteps, and the goal-conditioned agent was trained for 1 million timesteps. ... G was set to 20, while BG-GRAML used only 5. ... For each environment, we tested observation sequences that are 30%, 50%, 70%, and 100% of the full sequence, both consecutively and non-consecutively. |
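The "Experiment Setup" row states that observation sequences were truncated to 30%, 50%, 70%, and 100% of the full trace, both consecutively and non-consecutively. The paper does not specify the sampling mechanics, so the sketch below is one plausible interpretation, not the authors' implementation: the `truncate` function, its parameters, and the choice of uniform random sampling (order-preserving) for the non-consecutive case are all assumptions.

```python
import random


def truncate(obs_seq, fraction, consecutive=True, seed=0):
    """Return a partial observation sequence covering `fraction` of the trace.

    Consecutive mode keeps a prefix of the trace; non-consecutive mode
    samples the same number of steps uniformly at random, preserving
    their temporal order (an assumed strategy, not specified in the paper).
    """
    k = max(1, round(len(obs_seq) * fraction))
    if consecutive:
        return obs_seq[:k]
    rng = random.Random(seed)
    indices = sorted(rng.sample(range(len(obs_seq)), k))
    return [obs_seq[i] for i in indices]


# Example: a 10-step trace truncated to 30% consecutively keeps steps 0-2;
# non-consecutive 70% keeps 7 randomly chosen steps in temporal order.
trace = list(range(10))
prefix = truncate(trace, 0.3)                      # [0, 1, 2]
sparse = truncate(trace, 0.7, consecutive=False)   # 7 sorted indices
```

With 200 GR problems per scenario and four fractions in two modes, this yields eight evaluation conditions per environment, which matches the grid of results the paper reports.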