Convex Programming for Estimation in Nonlinear Recurrent Models

Authors: Sohail Bahmani, Justin Romberg

JMLR 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate the performance of the estimator by simulation on synthetic data. These numerical experiments also suggest the extent to which the imposed theoretical assumptions may be relaxed.
Researcher Affiliation | Academia | Sohail Bahmani (EMAIL), School of Electrical & Computer Engineering, Georgia Institute of Technology, Atlanta, GA 30332; Justin Romberg (EMAIL), School of Electrical & Computer Engineering, Georgia Institute of Technology, Atlanta, GA 30332
Pseudocode | No | The paper includes mathematical formulations, theorems, and proofs, but does not contain any clearly labeled pseudocode or algorithm blocks.
Open Source Code | No | The paper does not contain any explicit statement about providing open-source code or a link to a code repository.
Open Datasets | No | We evaluated the proposed estimator numerically on synthetic data in a setup similar to the experiments of (Oymak, 2019). ... B ∈ R^{n×p} is generated randomly with i.i.d. standard normal entries.
Dataset Splits | No | The paper uses synthetic data (100 randomly generated problem instances per simulation), but does not describe any train/test/validation splits for a specific dataset.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU or GPU models) used for running the experiments.
Software Dependencies | No | For each choice of α and ρ, we solved (4) using Nesterov's Accelerated Gradient Method (AGM) (Nesterov, 1983; Nesterov, 2013, Section 2.2). The optimization task can be solved by SGD as well.
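For readers unfamiliar with the solver named in this row, a minimal NumPy sketch of Nesterov's Accelerated Gradient Method follows. The function name `agm`, the stopping rule, and the default parameters are illustrative assumptions; the paper does not publish its implementation.

```python
import numpy as np

def agm(grad, x0, step_size, max_iter=500, tol=1e-8):
    """Nesterov's Accelerated Gradient Method (illustrative sketch).

    grad      : callable returning the gradient at a point.
    x0        : starting point (NumPy array).
    step_size : fixed step size (should be at most 1/L for an
                L-smooth objective).
    Stops early when the relative change between iterates is small.
    """
    x_prev = x0.copy()
    y = x0.copy()
    t_prev = 1.0
    for _ in range(max_iter):
        # Gradient step from the extrapolated point y.
        x = y - step_size * grad(y)
        # Nesterov momentum coefficient update.
        t = (1.0 + np.sqrt(1.0 + 4.0 * t_prev**2)) / 2.0
        y = x + ((t_prev - 1.0) / t) * (x - x_prev)
        # Relative-change stopping rule (an assumption, not the paper's).
        if np.linalg.norm(x - x_prev) <= tol * max(np.linalg.norm(x_prev), 1.0):
            x_prev = x
            break
        x_prev, t_prev = x, t
    return x_prev
```

As a quick sanity check, on the quadratic f(x) = ½‖x − b‖² (gradient x − b, smoothness constant L = 1) the iterates converge to b.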
Experiment Setup | Yes | In all of the experiments, we consider the dimensions to be n = 50, p = 100, and the time horizon to be T = 500. For α ∈ {0.2, 0.8} we choose A = αR with R being a uniformly distributed n × n orthogonal matrix. Furthermore, B ∈ R^{n×p} is generated randomly with i.i.d. standard normal entries. ... The nonlinearity in (1) is described by one of the functions... at ρ = 1 (i.e., linear activation), ρ = 0.5 (i.e., leaky ReLU activation with slope 0.5 over R_{<0}), ρ = 0.3 (i.e., leaky ReLU activation with slope 0.3 over R_{<0}), and ρ = 0 (i.e., ReLU activation). ... For the Gaussian model the step size is set to 10^{-3}, whereas for the heavy-tailed model the step size is set to 10^{-4}. In each trial, the AGM is run for a maximum of 500 iterations and terminated early only if the relative error dropped below 10^{-8} (i.e., ‖Ĉ − C‖_F² / ‖C‖_F² ≤ 10^{-8}).
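The quoted setup can be sketched as a small data-generation routine, assuming a recurrence of the form h_{t+1} = φ(A h_t + B x_t) for the model in (1). Everything beyond the quoted parameters — Gaussian inputs x_t, a zero initial state, the seed, and the function name — is an assumption, not the authors' implementation.

```python
import numpy as np

def generate_synthetic(n=50, p=100, T=500, alpha=0.8, rho=0.5, seed=0):
    """Synthetic recurrent data per the quoted setup (illustrative sketch).

    A = alpha * R with R a uniformly random n x n orthogonal matrix;
    B has i.i.d. standard normal entries; phi is a leaky ReLU with
    slope rho on the negative axis (rho=1: linear, rho=0: ReLU).
    Inputs x_t are assumed i.i.d. Gaussian and h_0 = 0 (assumptions).
    """
    rng = np.random.default_rng(seed)
    # Uniformly random orthogonal matrix via QR of a Gaussian matrix,
    # with the usual sign fix so the distribution is Haar-uniform.
    Q, r = np.linalg.qr(rng.standard_normal((n, n)))
    Q *= np.sign(np.diag(r))
    A = alpha * Q
    B = rng.standard_normal((n, p))
    phi = lambda z: np.where(z >= 0.0, z, rho * z)  # leaky ReLU
    X = rng.standard_normal((T, p))   # assumed input distribution
    H = np.zeros((T, n))
    h = np.zeros(n)                   # assumed zero initial state
    for t in range(T):
        h = phi(A @ h + B @ X[t])
        H[t] = h
    return A, B, X, H
```

Since Q is orthogonal, AᵀA = α²I, so ‖A‖₂ = α < 1 keeps the recursion contractive in the α ∈ {0.2, 0.8} settings quoted above.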