Optimally Solving Dec-POMDPs as Continuous-State MDPs
Authors: Jilles Steeve Dibangoye, Christopher Amato, Olivier Buffet, François Charpillet
JAIR 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We include an extensive empirical analysis using well-known benchmarks, thereby demonstrating that our approach provides significant scalability improvements compared to the state of the art. [...] This section empirically demonstrates and validates the importance of our feature-based heuristic search value iteration (FB-HSVI) algorithm. We show that FB-HSVI outperforms all existing exact algorithms on all tested domains from the literature and that FB-HSVI can solve those problems over unprecedented time horizons. |
| Researcher Affiliation | Academia | Jilles Steeve Dibangoye EMAIL Univ de Lyon INSA-Lyon, CITI-Inria, F-69621, France Christopher Amato EMAIL University of New Hampshire Durham, NH, USA Olivier Buffet EMAIL François Charpillet EMAIL Inria Université de Lorraine CNRS Villers-lès-Nancy, F-54600, France |
| Pseudocode | Yes | Algorithm 1: The OHSVI algorithm [...] Algorithm 2: The FB-HSVI Algorithm. |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described. It states "Our three FB-HSVI variants (Table 2) were implemented in the same framework..." but does not provide a link or explicit statement of code release. |
| Open Datasets | Yes | We selected benchmarks with the goal of spanning the range of properties that may affect the performance of a Dec-POMDP solver. In Table 3, we review the selected domains and their properties. These domains can be downloaded at http://masplan.org. |
| Dataset Splits | No | The paper evaluates algorithms on predefined Dec-POMDP problem instances (domains) like "Dec-Tiger" and "Recycling-Robots", for which the concept of training/test/validation dataset splits is not applicable. The entire problem instance serves as the 'data' for which an optimal policy is sought. |
| Hardware Specification | No | The paper mentions that experiments were conducted on different machines, but does not provide specific hardware details (e.g., CPU, GPU models, or memory specifications) used for running its experiments. |
| Software Dependencies | No | The paper states that its algorithms were 'implemented in the same framework' and refers to other existing algorithms, but does not provide specific software dependencies or version numbers (e.g., programming language versions, library versions, or solver versions). |
| Experiment Setup | Yes | We terminate FB-HSVI whenever the distance between lower and upper bounds is within ε = 0.01. A time limit was set to 1000ms. |
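
The stopping rule quoted in the Experiment Setup row (stop once the gap between the upper and lower bounds is within ε, or when the time limit expires) can be illustrated with a minimal Python sketch. This is not the authors' FB-HSVI implementation: `run_until_converged`, `improve_bounds`, and `halve_gap` are hypothetical names, and the toy bound update merely stands in for whatever one search trial would do.

```python
import time


def run_until_converged(improve_bounds, lower, upper, epsilon=0.01, time_limit_s=1.0):
    """Generic anytime loop (illustrative only, not FB-HSVI itself).

    Repeats `improve_bounds` until the bound gap is at most `epsilon`
    or `time_limit_s` seconds have elapsed, whichever comes first.
    """
    start = time.monotonic()
    iterations = 0
    while upper - lower > epsilon and time.monotonic() - start < time_limit_s:
        lower, upper = improve_bounds(lower, upper)
        iterations += 1
    return lower, upper, iterations


def halve_gap(lower, upper):
    # Hypothetical stand-in for one search trial: it tightens both bounds
    # so that the gap between them is cut in half.
    step = (upper - lower) / 4.0
    return lower + step, upper - step


if __name__ == "__main__":
    lo, hi, n = run_until_converged(halve_gap, lower=0.0, upper=10.0)
    print(f"gap={hi - lo:.6f} after {n} iterations")
```

The defaults mirror the quoted values (ε = 0.01 and a 1000 ms limit, expressed here in seconds). In a real solver the bound update would be the expensive step, so the time check would more often be the condition that ends the loop.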