Stability of Controllers for Gaussian Process Dynamics
Authors: Julia Vinogradska, Bastian Bischoff, Duy Nguyen-Tuong, Jan Peters
JMLR 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical evaluations on simulated benchmark problems support our theoretical results. |
| Researcher Affiliation | Collaboration | (1) Corporate Research, Robert Bosch GmbH, Robert-Bosch-Campus 1, 71272 Renningen; (2) Intelligent Autonomous Systems Lab, Technische Universität Darmstadt, Hochschulstraße 10, 64289 Darmstadt |
| Pseudocode | Yes | Algorithm 1: Stability region X_c for GP mean dynamics. Input: dynamics GP f, control policy π, x_d, γ. Output: stability region X_c. Algorithm 2: Stability region for GP dynamics. Input: dynamics GP f, control policy π, time horizon T, target region Q, approximation error tolerance e_tol, desired success probability 1 − λ. Output: stability region X_c. Algorithm 3: Construction of composed quadrature rules. Input: dynamics GP f: (x(t), u(t)) ↦ x(t+1), control policy π: x ↦ π_θ(x) with parameters θ, state space X, maximum partition size L_max. Output: composed quadrature rule with nodes X and weight vector w. |
| Open Source Code | No | The paper does not provide an explicit statement or link to its own source code for the methodology described. |
| Open Datasets | Yes | Mountain Car. A car with limited engine power has to reach a desired point in the mountainscape (Sutton and Barto, 1998). Inverted Pendulum. In the inverted pendulum task, the goal is to bring the pendulum to an upright position with limited torque (see Doya, 2000) and balance it there. Cart-Pole. In the cart-pole domain (Deisenroth et al., 2015), a cart with an attached free-swinging pendulum is running on a track of limited length. |
| Dataset Splits | No | The paper mentions that the GP dynamics model was trained on 250 data points from trajectories with random starting points and control gains for Mountain Car, 200 points for Inverted Pendulum, and 250 points for Cart-Pole. However, it does not specify how these data points were split into training, validation, or test sets. |
| Hardware Specification | No | Please note also that all necessary computations for the proposed approach can be executed in parallel. Thus, we conduct these computations on a GPU, which leads to a significant speedup and overall computation time comparable to the 2D examples ( 140s). |
| Software Dependencies | No | The paper references a quadrature rule CN:3-1 (Stroud, 1971; code from Burkardt, 2014) but does not provide specific version numbers for any software libraries, programming languages (other than general mentions), or solvers used in the implementation. |
| Experiment Setup | Yes | Mountain Car. ... We analyze stability of a PD-controller π((x, ẋ)^⊤) = K_p x + K_d ẋ. The gains are chosen as K_p = 25 and K_d = 1 and the control signal is limited to u_max = 4. Inverted Pendulum. ... We evaluate stability of a PD-controller with K_p = 6, K_d = 3 and control limit u_max = 1.2. |
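
The PD controller quoted in the Experiment Setup row can be illustrated with a minimal Python sketch. The function name `pd_control` is hypothetical. Treating the control limit as a symmetric clip to [-u_max, u_max], and measuring the state relative to the target, are assumptions, since the excerpt only says the control signal is "limited". This is not the authors' implementation.

```python
def pd_control(x, x_dot, kp, kd, u_max):
    """Saturated PD law u = kp * x + kd * x_dot.

    Assumption: the limit is applied as a symmetric clip to [-u_max, u_max];
    the excerpt does not state how the saturation is implemented.
    """
    u = kp * x + kd * x_dot
    return max(-u_max, min(u_max, u))


# Gains and limits as quoted in the table.
MOUNTAIN_CAR = dict(kp=25.0, kd=1.0, u_max=4.0)
INVERTED_PENDULUM = dict(kp=6.0, kd=3.0, u_max=1.2)

if __name__ == "__main__":
    # Small deviation: the unsaturated PD term is used directly.
    print(pd_control(0.1, 0.0, **MOUNTAIN_CAR))
    # Large deviation: the signal saturates at the control limit.
    print(pd_control(1.0, 0.0, **MOUNTAIN_CAR))
```

With the Mountain Car gains, any position deviation above u_max / K_p = 0.16 (at zero velocity) already saturates the control, which is one reason the saturation cannot be ignored when analyzing the closed loop.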