DESPOT: Online POMDP Planning with Regularization

Authors: Nan Ye, Adhiraj Somani, David Hsu, Wee Sun Lee

JAIR 2017

Reproducibility Variable: Result — LLM Response

Research Type: Experimental
"The algorithm demonstrates strong experimental results, compared with some of the best online POMDP algorithms available. It has also been incorporated into an autonomous driving system for real-time vehicle control. The source code for the algorithm is available online. ... Experiments show that the anytime DESPOT algorithm is successful on very large POMDPs with up to 10^56 states."

Researcher Affiliation: Academia
Nan Ye — ACEMS & Queensland University of Technology, Australia; Adhiraj Somani, David Hsu, Wee Sun Lee — National University of Singapore, Singapore

Pseudocode: Yes
"Appendix B. Pseudocode for Anytime DESPOT ... Algorithm 6 Anytime DESPOT"

Open Source Code: Yes
"The source code for the algorithm is available online. ... The source code for the algorithm is available at http://bigbird.comp.nus.edu.sg/pmwiki/farm/appl/."

Open Datasets: Yes
"Tag is a standard POMDP benchmark introduced by Pineau et al. (2003). ... Next we consider Rock Sample, a well-established benchmark with a large state space (Smith & Simmons, 2004). ... Pocman (Silver & Veness, 2010) is a partially observable variant of the popular video game Pacman (Figure 4d)."

Dataset Splits: No
"For each algorithm, we tuned the key parameters on each domain through offline training, using a data set distinct from the online test data set, as we expect this to be the common usage mode for online planning. ... The online POMDP algorithms were given exactly 1 second per step to choose an action."

Hardware Specification: No
The paper provides no specific hardware details (GPU models, CPU models, memory, or cloud instance types) for running the experiments. It mentions only that "All algorithms were implemented in C++" and the time limit for online planning.

Software Dependencies: No
"We implemented DESPOT and AEMS2 ourselves. We used the authors' implementation of POMCP (Silver & Veness, 2010), but improved the implementation to support a very large number of observations and strictly adhere to the time limit for online planning. We used the APPL package for SARSOP (Kurniawati et al., 2008). All algorithms were implemented in C++."

Experiment Setup: Yes
"Specifically, the regularization parameter λ for DESPOT was selected offline from the set {0, 0.01, 0.1, 1, 10} by running the algorithm with a training set distinct from the online test set. Similarly, the exploration constant c of POMCP was chosen from the set {1, 10, 100, 1000, 10000} for the best performance. ... Specifically, we chose ξ = 0.95 as in SARSOP (Kurniawati et al., 2008). We chose D = 90 for DESPOT because γ^D ≈ 0.01 when γ = 0.95, which is the typical discount factor used. We chose K = 500, but a smaller value may work as well."
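The quoted setup justifies the planning depth D = 90 by the discount factor: with γ = 0.95, γ^D falls to roughly 0.01, so rewards beyond depth 90 contribute negligibly. A minimal sketch of that arithmetic (the values γ = 0.95 and the 0.01 target are from the quoted setup; the helper name is mine):

```python
import math

def min_depth(gamma: float, target: float) -> int:
    """Smallest depth D such that gamma**D <= target."""
    return math.ceil(math.log(target) / math.log(gamma))

D = min_depth(0.95, 0.01)
print(D)           # 90, matching the paper's choice of D = 90
print(0.95 ** D)   # ~0.0099, i.e. gamma^D has just dropped below 0.01
```

Since 0.95^89 is still slightly above 0.01, D = 90 is indeed the smallest depth meeting the cutoff.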