Sample-Based Tree Search with Fixed and Adaptive State Abstractions
Authors: Jesse Hostetler, Alan Fern, Thomas Dietterich
JAIR 2017 | Venue PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | This paper presents a theoretical and empirical evaluation of tree search with both fixed and adaptive state abstractions. We evaluate PARSS as well as sparse sampling with fixed abstractions on 12 experimental problems, and find that PARSS outperforms search with a fixed abstraction and that search with even highly inaccurate fixed abstractions outperforms search without abstraction. These results establish progressive abstraction refinement as a promising basis for new tree search algorithms, and we propose directions for future work within the progressive refinement framework. |
| Researcher Affiliation | Academia | Jesse Hostetler EMAIL Alan Fern EMAIL Thomas Dietterich EMAIL Oregon State University, Department of Electrical Engineering and Computer Science, Corvallis, OR 97331 USA |
| Pseudocode | Yes | Algorithm 1 Abstract SBTS Framework Algorithm 2 Abstract Sparse Sampling Algorithm 3 Abstract Forward Search Sparse Sampling Algorithm 4 Abstract Trajectory Sampling (with UCT Variation) Algorithm 5 A generic abstraction refinement procedure Algorithm 6 Modified Sample procedure for PARSS Algorithm 7 Progressive Abstraction Refinement for SS Algorithm 8 Modified Sample procedure for rand-FSSS |
| Open Source Code | Yes | The complete source code used in our experiments is available at https://github.com/jhostetler/jmcplan/releases/tag/v0.1. |
| Open Datasets | Yes | Sailing is based on a test domain used by Kocsis and Szepesvári (2006) and Jiang et al. (2014). [...] Racetrack is a classic domain introduced by Barto, Bradtke, and Singh (1995). [...] Academic Advising (Advising) is a modification of the IPC problem of the same name (Guerin, Hanna, Ferland, Mattei, & Goldsmith, 2012). We used MDP instance 1 from the IPC 2014. [...] Crossing Traffic is a grid navigation problem in which the agent must cross several lanes of traffic (obstacles that move right-to-left) without being hit. We used MDP instance 4 from the IPC 2014. [...] In Elevators, the agent controls one or more elevator cars and must use them to pick up and drop off passengers. We used MDP instance 7 from the IPC 2014. [...] In Tamarisk, the agent is trying to prevent the invasive tamarisk plant from colonizing a river system. We used MDP instance 2 from the IPC 2014. [...] Tetris is the classic videogame of stacking differently shaped blocks. It has quite a long history in AI research (e.g. Bertsekas & Ioffe, 1996; Gabillon, Ghavamzadeh, & Scherrer, 2013). |
| Dataset Splits | No | The paper describes simulation-based experiments using 'random trajectories' and 'random problem instances' from various domains (e.g., IPC problems). It does not specify explicit training, validation, or test dataset splits in the typical machine learning sense with percentages or sample counts. |
| Hardware Specification | No | The parameter search space Θ had to be curtailed because the search algorithms were exceeding the 16GB memory limit of our hardware. Our experiments were conducted on a shared computer cluster comprised of heterogeneous hardware |
| Software Dependencies | No | The paper mentions measuring time using the 'Java API function System.currentTimeMillis()' and using 'The R package scmamp (Calvo & Santafe, 2015)' for statistical comparison. However, it does not specify version numbers for the core Java development environment or any other key software libraries used to implement the algorithms described in the paper. |
| Experiment Setup | Yes | The problem is parameterized by integers pmin, pmax, Tb, Ti, Tm where pmin pmax and Ti, Tb, Tm > 0. [...] We instantiate the Saving problem with parameters pmin = 4, pmax = 4, Ti = 4, and Tb = 4, and with an episode length of 30. [...] We set g = 4, g = 2, η = 0.2, and η0 = 0.8. The episode length is 40, which is the same as the IPC version. [...] For most problems, C {1, 2, 5, 10, 20, 50}, and d {i, i + 1, . . . , i + m} where i and m are small integers. We expanded the range of C to {1, 2, 5, . . . , 100, 200} for Spanish Blackjack due to its large stochastic branching factor. The random branching factor B was varied over either {2, 3, 5} or {2, 4} depending on the problem. |