Partially Observable Reference Policy Programming

Authors: Edward Kim, Hanna Kurniawati

IJCAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Empirical evaluations on two large-scale problems with dynamically evolving environments including a helicopter emergency scenario in the Corsica region requiring approximately 150 planning steps corroborate the theoretical results and indicate that our solver considerably outperforms current online benchmarks." Also: "Experimental results indicate that PORPP substantially outperforms current state-of-the-art online POMDP benchmarks."
Researcher Affiliation | Academia | Edward Kim and Hanna Kurniawati, Australian National University
Pseudocode | Yes | Algorithm 1 (PORPP) and Algorithm 2 (SIMULATE(h, s, depth))
Open Source Code | Yes | "Technical details and proofs are contained in the Supplementary Material (https://github.com/RDLLab/pomdp-py-porpp)." Footnote 3: "See https://github.com/RDLLab/pomdp-py-porpp for the code and parameters used to run the experiments."
Open Datasets | No | The paper describes problem scenarios and custom environments ('3D Maze with Poor Localisation' and 'HEMS Mission with Evolving No-Fly-Zones') and mentions that the terrain mesh for the HEMS mission was 'extracted from X-Plane 12'. However, it does not provide concrete access information (link, DOI, repository, or formal citation) for any publicly available dataset used in the experiments.
Dataset Splits | No | The paper describes problem scenarios and reports simulations ('100 runs' in tables), but it does not use the explicit training/validation/test splits typical of data-driven machine learning experiments, so no split information is provided.
Hardware Specification | Yes | "All experiments were performed on a desktop computer with 128GB DDR4 RAM and an 8 Core Intel Xeon Silver 4110 Processor."
Software Dependencies | No | "All solvers were implemented in the pomdp_py library [H2RLab, 2024] and Cythonised for a fair comparison." pomdp_py and Cython are named, but no version numbers are given for either.
Experiment Setup | Yes | "The discount factor for all environments was γ = 0.99."; "Parameters: κA 0, αA (0, 1), Dmax 1, η > 0."; maximum macro action length = 10 (Table 1); expands 16 macro actions (for the POMCP benchmark).
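To put the reported discount factor in context, the short sketch below computes how much weight a reward receives at a given planning step. It uses only the two figures quoted above (γ = 0.99 and roughly 150 planning steps). The function names are hypothetical and are not taken from the paper or its repository.

```python
def discount_weight(gamma: float, step: int) -> float:
    """Weight gamma**step applied to a reward received at the given step."""
    return gamma ** step


def effective_horizon(gamma: float) -> float:
    """Common rule-of-thumb horizon 1 / (1 - gamma) for a discounted problem."""
    return 1.0 / (1.0 - gamma)


if __name__ == "__main__":
    gamma = 0.99   # discount factor reported in the Experiment Setup row
    steps = 150    # approximate planning steps quoted for the HEMS scenario
    print(f"weight at step {steps}: {discount_weight(gamma, steps):.3f}")
    print(f"rule-of-thumb horizon: {effective_horizon(gamma):.0f} steps")
```

With these values, a reward received at step 150 keeps roughly 22% of its undiscounted weight, so rewards late in a 150-step mission still carry noticeable weight in the return.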