reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Partially Observable Reference Policy Programming

Authors: Edward Kim, Hanna Kurniawati

IJCAI 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Empirical evaluations on two large-scale problems with dynamically evolving environments, including a helicopter emergency scenario in the Corsica region requiring approximately 150 planning steps, corroborate the theoretical results and indicate that our solver considerably outperforms current online benchmarks. Experimental results indicate that PORPP substantially outperforms current state-of-the-art online POMDP benchmarks.
Researcher Affiliation	Academia	Edward Kim and Hanna Kurniawati, Australian National University
Pseudocode	Yes	Algorithm 1 PORPP and Algorithm 2 SIMULATE(h, s, depth)
Open Source Code	Yes	Technical details and proofs are contained in the Supplementary Material (https://github.com/RDLLab/pomdp-py-porpp). See https://github.com/RDLLab/pomdp-py-porpp for the code and parameters used to run the experiments.
Open Datasets	No	The paper describes problem scenarios and custom environments ('3D Maze with Poor Localisation' and 'HEMS Mission with Evolving No-Fly-Zones'), and mentions that the terrain mesh for the HEMS mission was 'extracted from X-Plane 12'. However, it does not provide concrete access information (link, DOI, repository, or formal citation for a dataset) for any publicly available or open dataset used in the experiments.
Dataset Splits	No	The paper describes problem scenarios and conducts simulations ('100 runs' in tables) but does not involve explicit training/test/validation dataset splits typically found in data-driven machine learning experiments. Therefore, no specific dataset split information is provided.
Hardware Specification	Yes	All experiments were performed on a desktop computer with 128GB DDR4 RAM and an 8 Core Intel Xeon Silver 4110 Processor.
Software Dependencies	No	All solvers were implemented in the pomdp-py library [H2RLab, 2024] and Cythonised for a fair comparison. While 'pomdp-py' and 'Cython' are mentioned, specific version numbers for these software components are not provided.
Experiment Setup	Yes	The discount factor for all environments was γ = 0.99. Parameters: κ_A ≥ 0, α_A ∈ (0, 1), D_max ≥ 1, η > 0. Maximum macro action length = 10 (Table 1); expands 16 macro actions (for the POMCP benchmark).
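
To give a rough sense of what the reported discount factor implies over the roughly 150 planning steps mentioned for the HEMS scenario, the following Python sketch computes the weight applied to a reward at the final step and the discounted return of a constant unit reward. The function names and the unit-reward sequence are hypothetical illustrations, not taken from the paper or its repository.

```python
from typing import Sequence

GAMMA = 0.99    # discount factor reported for all environments
HORIZON = 150   # approximate number of planning steps (HEMS scenario)


def discount_weight(gamma: float, step: int) -> float:
    """Weight gamma**step applied to a reward received `step` steps ahead."""
    return gamma ** step


def discounted_return(rewards: Sequence[float], gamma: float) -> float:
    """Sum of gamma**t * rewards[t], accumulated backwards for stability."""
    total = 0.0
    for r in reversed(rewards):
        total = r + gamma * total
    return total


# Weight on a reward at the end of the horizon (illustrative only).
weight_last = discount_weight(GAMMA, HORIZON)

# Discounted return of a hypothetical reward of 1.0 at every step.
unit_return = discounted_return([1.0] * HORIZON, GAMMA)

print(f"weight at step {HORIZON}: {weight_last:.3f}")
print(f"discounted return of unit rewards: {unit_return:.2f}")
```

Under these assumptions, a reward at the end of the horizon still carries a little over a fifth of its undiscounted value, which suggests that long-horizon outcomes remain relevant to the planner at γ = 0.99.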