Regret Analysis of Multi-task Representation Learning for Linear-Quadratic Adaptive Control
Authors: Bruce D. Lee, Leonardo F. Toso, Thomas T. Zhang, James Anderson, Nikolai Matni
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Section 5, Numerical Validation: We present numerical results to validate our bounds. In particular, we compare multi-task representation learning approach for the adaptive LQR design (Algorithm 1) over the setting where a single system attempts to learn its dynamics by using its local simulation data and computes a CE controller on top of the estimated model. |
| Researcher Affiliation | Academia | (1) Department of Electrical and Systems Engineering, University of Pennsylvania; (2) Department of Electrical Engineering, Columbia University |
| Pseudocode | Yes | Algorithm 1: Shared-Representation Certainty-Equivalent Control with Continual Exploration; Algorithm 2: Least squares: $\mathrm{LS}(\hat{\Phi}, x_{1:t+1}, u_{1:t})$; Algorithm 3: De-bias & Feature Whiten: $\mathrm{DFW}(\hat{\Phi}, x^{(1:H)}_{1:t}, u^{(1:H)}_{1:t}, N)$ |
| Open Source Code | No | The paper does not provide any concrete access to source code for the methodology described. |
| Open Datasets | No | We generate $H$ $(A^{(h)}_\star, B^{(h)}_\star)$, by first considering a set of nominal cartpole parameters: $c^{(1)}_p = (0.4, 1.0, 1.0)$, $c^{(2)}_p = (1.6, 1.3, 0.3)$, $c^{(3)}_p = (1.3, 0.7, 0.65)$, $c^{(4)}_p = (0.2, 0.055, 1.36)$, and $c^{(5)}_p = (0.2, 0.47, 1.825)$. We then perturb such parameters with a random scalar within the interval $(0, 0.1)$ to generate different cartpole parameters $c^{(h)}_p$. With the system matrices $(A^{(h)}_\star, B^{(h)}_\star)$ in hands, for all $h \in [H]$, we generate the disturbance signal as $w^{(h)}_t \sim \mathcal{N}(0, 0.01 I_{d_X})$. |
| Dataset Splits | No | The paper describes generating data for numerical validation and running experiments for a certain number of timesteps and tasks, but it does not specify explicit training/test/validation dataset splits. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate the experiment. |
| Experiment Setup | Yes | Figure 1: Regret of Algorithm 1 with varying number of tasks $H$. We consider $k_{\mathrm{fin}} = 10$ epochs with initial epoch length ς1 = 30, an exploratory sequence scaling as ε2 k 1 2k , state and controller bounds $x_b = 25$, and $K_b = 15$, and random $\Phi_0$ with $d(\Phi_0, \Phi_\star)$ 0.99. ... We set the gravity $g = 1$ and perform the discretization of (9) with step-size 0.25. ... we generate the disturbance signal as $w^{(h)}_t \sim \mathcal{N}(0, 0.01 I_{d_X})$ and set the step-size and number of iterations of Algorithm 3 as φ = 0.25, and $N = 1000$. |
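
Since no code is released, the task-generation step quoted in the Open Datasets and Experiment Setup rows can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes the random scalar is added to all three entries of a nominal parameter vector, that tasks cycle through the five nominal vectors, and that the state dimension is 4; none of these are stated in the excerpts. The function names are hypothetical, and the cartpole dynamics of equation (9) are not reproduced here.

```python
import numpy as np

# Nominal cartpole parameter vectors quoted in the table above.
NOMINAL_PARAMS = np.array([
    [0.4, 1.0, 1.0],
    [1.6, 1.3, 0.3],
    [1.3, 0.7, 0.65],
    [0.2, 0.055, 1.36],
    [0.2, 0.47, 1.825],
])


def sample_task_parameters(num_tasks, rng):
    """Perturb nominal parameters with one random scalar per task.

    Assumptions (not stated in the source): the scalar is added to every
    entry, and task h uses nominal vector h mod 5.
    """
    idx = np.arange(num_tasks) % len(NOMINAL_PARAMS)
    scalars = rng.uniform(0.0, 0.1, size=(num_tasks, 1))
    return NOMINAL_PARAMS[idx] + scalars


def sample_disturbances(num_tasks, horizon, state_dim, rng):
    """Draw w_t^(h) ~ N(0, 0.01 I); the standard deviation is sqrt(0.01)."""
    return rng.normal(0.0, np.sqrt(0.01), size=(num_tasks, horizon, state_dim))


rng = np.random.default_rng(0)
params = sample_task_parameters(10, rng)   # shape (10, 3)
noise = sample_disturbances(10, 30, 4, rng)  # state_dim = 4 is an assumption
```

The horizon of 30 matches the initial epoch length quoted above; mapping each parameter vector to system matrices would additionally require the linearized, discretized dynamics from the paper.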