Regret Analysis of Multi-task Representation Learning for Linear-Quadratic Adaptive Control

Authors: Bruce D. Lee, Leonardo F. Toso, Thomas T. Zhang, James Anderson, Nikolai Matni

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Section 5 (Numerical Validation): We present numerical results to validate our bounds. In particular, we compare the multi-task representation learning approach for adaptive LQR design (Algorithm 1) against the setting where a single system learns its dynamics from its local simulation data and computes a CE controller on top of the estimated model.
Researcher Affiliation | Academia | (1) Department of Electrical and Systems Engineering, University of Pennsylvania; (2) Department of Electrical Engineering, Columbia University
Pseudocode | Yes | Algorithm 1: Shared-Representation Certainty-Equivalent Control with Continual Exploration; Algorithm 2: Least squares, LS($\hat{\Phi}$, $x_{1:t+1}$, $u_{1:t}$); Algorithm 3: De-bias & Feature Whiten, DFW($\hat{\Phi}$, $x^{(1:H)}_{1:t}$, $u^{(1:H)}_{1:t}$, $N$)
Open Source Code | No | The paper does not provide access to source code for the methodology described.
Open Datasets | No | We generate $H$ systems $(A^{(h)}_\star, B^{(h)}_\star)$ by first considering a set of nominal cartpole parameters: $c_p^{(1)} = (0.4, 1.0, 1.0)$, $c_p^{(2)} = (1.6, 1.3, 0.3)$, $c_p^{(3)} = (1.3, 0.7, 0.65)$, $c_p^{(4)} = (0.2, 0.055, 1.36)$, and $c_p^{(5)} = (0.2, 0.47, 1.825)$. We then perturb these parameters with a random scalar in the interval $(0, 0.1)$ to generate different cartpole parameters $c_p^{(h)}$. With the system matrices $(A^{(h)}_\star, B^{(h)}_\star)$ in hand for all $h \in [H]$, we generate the disturbance signal as $w_t^{(h)} \sim \mathcal{N}(0, 0.01 I_{d_X})$.
Dataset Splits | No | The paper describes generating data for numerical validation and running experiments for a certain number of timesteps and tasks, but it does not specify explicit training/test/validation dataset splits.
Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments.
Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate the experiment.
Experiment Setup | Yes | Figure 1: Regret of Algorithm 1 with varying number of tasks $H$. We consider $k_{\mathrm{fin}} = 10$ epochs with initial epoch length $\tau_1 = 30$, an exploratory sequence scaling as ε2 k 1 2k , state and controller bounds $x_b = 25$ and $K_b = 15$, and random !0 with d(!0, !ω) 0.99. ... We set the gravity $g = 1$ and perform the discretization of (9) with step-size 0.25. ... we generate the disturbance signal as $w_t^{(h)} \sim \mathcal{N}(0, 0.01 I_{d_X})$ and set the step-size and number of iterations of Algorithm 3 as $\eta = 0.25$ and $N = 1000$.
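
The data-generation recipe quoted in the Open Datasets row can be sketched in Python with NumPy. This is a minimal sketch under stated assumptions, not the authors' code: the helper names `perturbed_params` and `sample_disturbances` are hypothetical, the perturbation is assumed to be one uniform scalar per task added to every entry of a nominal triple, tasks are assumed to cycle through the five nominal triples, and the state dimension of 4 is an assumption about the cartpole model. The map from cartpole parameters to the system matrices is not reproduced because the source does not state it.

```python
import numpy as np

# Nominal cartpole parameter triples quoted in the Open Datasets row.
NOMINAL_PARAMS = np.array([
    [0.4, 1.0, 1.0],
    [1.6, 1.3, 0.3],
    [1.3, 0.7, 0.65],
    [0.2, 0.055, 1.36],
    [0.2, 0.47, 1.825],
])


def perturbed_params(num_tasks, rng):
    """Assumed scheme: cycle through the nominal triples and add one
    uniform scalar per task (NumPy samples the half-open range [0, 0.1))."""
    idx = np.arange(num_tasks) % len(NOMINAL_PARAMS)
    scalars = rng.uniform(0.0, 0.1, size=(num_tasks, 1))
    return NOMINAL_PARAMS[idx] + scalars


def sample_disturbances(num_tasks, horizon, state_dim, rng):
    """Draw w_t^(h) ~ N(0, 0.01 I), i.e. standard deviation 0.1 per coordinate."""
    return rng.normal(0.0, np.sqrt(0.01), size=(num_tasks, horizon, state_dim))


rng = np.random.default_rng(0)
params = perturbed_params(num_tasks=10, rng=rng)
# state_dim=4 is an assumption; horizon=30 echoes the initial epoch length.
noise = sample_disturbances(num_tasks=10, horizon=30, state_dim=4, rng=rng)
```

Passing an explicit `numpy.random.Generator` keeps every task's perturbation and noise reproducible from a single seed.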
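
The single-task baseline described in the Research Type row (estimate the dynamics from local data, then compute a certainty-equivalent controller) can be illustrated as follows. This is a generic sketch, not the paper's Algorithm 1 or Algorithm 2: it fits (A, B) by ordinary least squares without any shared representation, and it obtains an LQR gain by iterating the discrete-time Riccati recursion. The toy double-integrator system, the cost matrices Q = I and R = I, the trajectory length, and all function names are assumptions made for illustration; only the noise covariance 0.01 I and the step size 0.25 echo values quoted in the table.

```python
import numpy as np


def least_squares_dynamics(xs, us):
    """Fit x_{t+1} ~ A x_t + B u_t from one trajectory.
    xs has shape (T+1, dx); us has shape (T, du)."""
    z = np.hstack([xs[:-1], us])                        # regressors [x_t, u_t]
    theta, *_ = np.linalg.lstsq(z, xs[1:], rcond=None)  # z @ theta ~ x_{t+1}
    dx = xs.shape[1]
    return theta[:dx].T, theta[dx:].T                   # (A_hat, B_hat)


def ce_lqr_gain(a, b, q, r, iters=500):
    """Iterate the discrete Riccati recursion; returns K for u_t = K x_t."""
    p = q.copy()
    for _ in range(iters):
        k = -np.linalg.solve(r + b.T @ p @ b, b.T @ p @ a)
        p = q + a.T @ p @ (a + b @ k)
    return k


# Hypothetical toy system: a double integrator discretized with step 0.25.
rng = np.random.default_rng(0)
a_true = np.array([[1.0, 0.25], [0.0, 1.0]])
b_true = np.array([[0.0], [0.25]])

horizon = 500
xs = np.zeros((horizon + 1, 2))
us = rng.normal(size=(horizon, 1))                      # random exploratory inputs
for t in range(horizon):
    w = rng.normal(0.0, 0.1, size=2)                    # w_t ~ N(0, 0.01 I)
    xs[t + 1] = a_true @ xs[t] + b_true @ us[t] + w

a_hat, b_hat = least_squares_dynamics(xs, us)
k_ce = ce_lqr_gain(a_hat, b_hat, np.eye(2), np.eye(1))
```

The gain is computed from the estimates alone and then applied to the true system, which is what "certainty-equivalent" refers to here.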