Planning with Critical Section Macros: Theory and Practice
Authors: Lukáš Chrpa, Mauro Vallati
JAIR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We provide an extensive and detailed empirical evaluation on a broad range of domains. The experimental analysis is presented in Section 7. |
| Researcher Affiliation | Academia | Lukáš Chrpa EMAIL Faculty of Electrical Engineering, Czech Technical University in Prague, Jugoslávských partyzánů 1580/3, Prague, 160 00, Czechia; Mauro Vallati EMAIL School of Computing and Engineering, University of Huddersfield, Queensgate, Huddersfield, HD1 3DH, United Kingdom |
| Pseudocode | Yes | Algorithm 1: Assembling Macro-actions from a sequence of actions ... Algorithm 2: Learning (SpSl) CSMs from training plans ... Algorithm 3: A high-level routine for learning SpSl CSMs ... Algorithm 4: A high-level routine for learning compound CSMs |
| Open Source Code | Yes | Our code and benchmarks can be found at: https://github.com/lchrpa/CSMs. |
| Open Datasets | Yes | We considered a range of well-known benchmark domains from both deterministic and learning tracks of IPCs. In particular: Elevators, Floortile, GED, Hiking, Termes, and Transport from the deterministic track of IPCs 2011, 2014 and 2018, and Barman, Blocksworld (Bw), Depots, Gold Miner (Gold), Gripper, Matching-Bw, Rovers, Sokoban, and Thoughtful from the learning track of IPCs 2008 and 2011. We have also considered the Storage domain from IPC 2006, that was used for evaluating the BloMa technique (Chrpa & Siddiqui, 2015). ... Since 1998, the International Planning Competition (IPC) has been organised ... (footnote 1: http://ipc.icaps-conference.org) |
| Dataset Splits | Yes | As testing instances, for each domain we used those exploited in IPCs. There are 20 instances for the domains included in the deterministic tracks (except Storage), and 30 instances for the learning track benchmarks and Storage. ... we considered 6 training tasks per each domain such that their plan length was mostly within 30-80 actions. One training plan was considered per training task. |
| Hardware Specification | Yes | All the experiments were conducted on Intel Xeon E5-2620 v4 2.10 GHz with 32GB RAM. |
| Software Dependencies | No | The paper names specific planning engines (e.g., FF, LAMA, Probe, MpC, Mercury, Yahsp3, FDSS 2018, Dual BFWS) together with their citations, but it gives no version numbers for these planners or for any other software library or dependency. For example, it mentions 'FDSS 2018' without a full version identifier. |
| Experiment Setup | Yes | We considered 6 training tasks per each domain such that their plan length was mostly within 30-80 actions. One training plan was considered per training task. For each individual domain, out of all considered planners, a planner which generates best quality training plans... is selected to generate training plans for that domain. ... The thresholds for underrepresented macros, ν1 and ν2 (see Algorithms 2, 3 and 4) were set according to results of preliminary experiments. In particular, ν1 was set to maximum of 1/2 of the number of the training tasks and 1/3 occurrences of the most frequent macro, while ν2 was set to the number of training tasks. ... For each testing task a time limit of 900 seconds and a memory limit of 4 GB is applied. |
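
The threshold rule quoted in the Experiment Setup row can be illustrated with a short sketch. This is not the authors' code: the function name `macro_thresholds`, the dictionary of macro occurrence counts, and the macro names are hypothetical, and the sketch assumes no rounding is applied to ν1. It only computes the two thresholds; how Algorithms 2 to 4 compare macros against them is not shown here.

```python
def macro_thresholds(num_training_tasks, macro_occurrences):
    """Compute (nu1, nu2) as described in the quoted setup.

    nu1 = max(1/2 * number of training tasks,
              1/3 * occurrences of the most frequent macro)
    nu2 = number of training tasks

    `macro_occurrences` is a hypothetical mapping from a macro name to how
    often it occurs in the training plans (an assumption of this sketch).
    """
    # Occurrence count of the most frequent macro (0 if none were found).
    most_frequent = max(macro_occurrences.values(), default=0)
    nu1 = max(num_training_tasks / 2, most_frequent / 3)
    nu2 = num_training_tasks
    return nu1, nu2


# With the 6 training tasks per domain used in the paper and made-up counts:
nu1, nu2 = macro_thresholds(6, {"macro_a": 12, "macro_b": 4})
print(nu1, nu2)
```

With 6 training tasks, the first term of ν1 is always 3, so ν1 exceeds 3 only when the most frequent macro occurs more than 9 times in the training plans.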