Stochastic Online Optimization using Kalman Recursion
Authors: Joseph de Vilmarest, Olivier Wintenberger
JMLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, we demonstrate numerically the competitiveness of the static EKF for logistic regression in Section 6. ... 6 Experiments: We experiment the static EKF for logistic regression. Precisely, we compare the following sequential algorithms that we all initialize at 0: ... We evaluate the different algorithms with the mean squared error E[‖θ̂_t − θ*‖²] that we approximate by its empirical version on 100 samples. We display the results in Figure 2. ... 6.2 Real Data Sets: To illustrate better the robustness to misspecification, we run the same procedures on real data sets: Forest cover-type (Blackard and Dean, 1999): ... Adult income (Kohavi, 1996): ... We evaluate through an empirical version of E[L(θ̂_n)] − L(θ*) estimated on 100 samples and where L is estimated on the test set, see Figure 3. |
| Researcher Affiliation | Collaboration | Joseph de Vilmarest joseph.de EMAIL Laboratoire de Probabilités, Statistique et Modélisation, Sorbonne Université, CNRS, 4 place Jussieu, 75005 Paris, France and Électricité de France R&D; Olivier Wintenberger EMAIL Laboratoire de Probabilités, Statistique et Modélisation, Sorbonne Université, CNRS, 4 place Jussieu, 75005 Paris, France |
| Pseudocode | Yes | Algorithm 1: Static Extended Kalman Filter for Generalized Linear Model, Algorithm 2: Recursive updates of the ONS and the static EKF, Algorithm 3: Truncated Extended Kalman Filter for Logistic Regression |
| Open Source Code | No | The paper does not provide any explicit statement about releasing source code, nor any link to a code repository. |
| Open Datasets | Yes | Forest cover-type (Blackard and Dean, 1999): the feature vector is of dimension d = 54, and as it is a multi-class task (7 classes) we focus on classifying class 2 versus all others. There are n = 581012 instances and we randomly split in two halves for training and testing. Adult income (Kohavi, 1996): the objective is to predict whether a person's annual income is smaller or bigger than 50K. There are 14 explanatory variables, and we obtain d = 98 once categorical variables are transformed into binary variables. We use the canonical split between training (32561 instances) and testing (16281 instances). |
| Dataset Splits | Yes | Forest cover-type (Blackard and Dean, 1999): ... There are n = 581012 instances and we randomly split in two halves for training and testing. Adult income (Kohavi, 1996): ... We use the canonical split between training (32561 instances) and testing (16281 instances). |
| Hardware Specification | No | The paper does not provide specific hardware details used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific software dependency details with version numbers. |
| Experiment Setup | Yes | We experiment the static EKF for logistic regression. Precisely, we compare the following sequential algorithms that we all initialize at 0: ... We take the default value P₁ = I_d along with the value β = 0.49 suggested by Bercu et al. (2020). ... The convex region of search is a ball centered in 0 and of radius D_θ = 1.1‖θ*‖, a setting where we have good knowledge of θ*. We consider two choices of the exp-concavity constant on which the ONS crucially relies to define the gradient step size. First, we use the only available bound e^(−D_θ D_X). Second, in the settings where the step size is so small that the ONS doesn't move, we use the exp-concavity constant κ₀ at θ*. ... First we test the choice of the gradient step size γ = 1/(2D_X²√N) denoted by ASGD and a second version with γ = ‖θ*‖/(D_X√N) denoted by ASGD oracle. |
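Since the paper reports pseudocode but no released source code, the recursion of Algorithm 1 (static EKF for a generalized linear model, specialized to logistic regression) can be sketched as follows. This is a minimal illustrative reconstruction, not the authors' implementation: the function name `static_ekf_logistic`, the synthetic-data usage, and the exact update ordering are assumptions based on the standard static-EKF recursion, with the state initialized at 0 and P₁ = I_d as the paper specifies.

```python
import numpy as np

def sigmoid(z):
    """Logistic link function."""
    return 1.0 / (1.0 + np.exp(-z))

def static_ekf_logistic(X, y, P1=None):
    """Sketch of a static EKF recursion for logistic regression.

    State = parameter vector theta (no process noise, hence "static").
    One Kalman-style update per observation. Hypothetical helper,
    reconstructed from the paper's Algorithm 1 description.
    """
    n, d = X.shape
    theta = np.zeros(d)                    # initialized at 0, as in the paper
    P = np.eye(d) if P1 is None else P1.copy()   # default P1 = I_d
    for t in range(n):
        x = X[t]
        p = sigmoid(x @ theta)             # predicted probability
        a = p * (1.0 - p)                  # sigmoid derivative (local "noise variance")
        Px = P @ x
        # Covariance update (rank-one correction, Sherman-Morrison form)
        P = P - np.outer(Px, Px) * (a / (1.0 + a * (x @ Px)))
        # State update with the innovation y_t - p_t
        theta = theta + (P @ x) * (y[t] - p)
    return theta

# Hypothetical usage on synthetic logistic data
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
true_theta = np.array([1.0, -2.0, 0.5])
y = (sigmoid(X @ true_theta) > rng.uniform(size=500)).astype(float)
theta_hat = static_ekf_logistic(X, y)
```

Note the update order: the covariance P is refreshed first, and the new P multiplies the innovation in the state update, which is what distinguishes this recursion from a plain stochastic Newton step with a lagged Hessian estimate.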