Convergence of Unregularized Online Learning Algorithms
Authors: Yunwen Lei, Lei Shi, Zheng-Chu Guo
JMLR 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | In this paper we study the convergence of online gradient descent algorithms in reproducing kernel Hilbert spaces (RKHSs) without regularization. We establish a sufficient condition and a necessary condition for the convergence of excess generalization errors in expectation. A sufficient condition for the almost sure convergence is also given. With high probability, we provide explicit convergence rates of the excess generalization errors for both averaged iterates and the last iterate, which in turn also imply convergence rates with probability one. |
| Researcher Affiliation | Academia | Yunwen Lei (EMAIL), Shenzhen Key Laboratory of Computational Intelligence, Department of Computer Science and Engineering, Southern University of Science and Technology, Shenzhen 518055, China, and Department of Mathematics, City University of Hong Kong, Kowloon, Hong Kong, China; Lei Shi (EMAIL), School of Mathematical Sciences, Shanghai Key Laboratory for Contemporary Applied Mathematics, Fudan University, Shanghai 200433, China; Zheng-Chu Guo (EMAIL), School of Mathematical Sciences, Zhejiang University, Hangzhou 310027, China |
| Pseudocode | No | The paper describes the online gradient descent update rule as a mathematical formula (1.1): $f_{t+1} = f_t - \eta_t \phi'(y_t, f_t(x_t)) K_{x_t}$, $t \in \mathbb{N}$, but this is not presented as a structured pseudocode block or algorithm section. |
| Open Source Code | No | The paper only provides a license link: 'License: CC-BY 4.0, see https://creativecommons.org/licenses/by/4.0/. Attribution requirements are provided at http://jmlr.org/papers/v18/17-457.html.' It does not contain any explicit statements or links to source code for the described methodology. |
| Open Datasets | No | The paper discusses training examples $\{z_t = (x_t, y_t)\}_{t \in \mathbb{N}}$ sequentially and identically drawn from a probability measure $\rho$, and a Mercer kernel $K : X \times X \to \mathbb{R}$, but these are theoretical constructs for the mathematical analysis, not specific datasets used in experiments. No concrete access information for any dataset is provided. |
| Dataset Splits | No | The paper is theoretical and does not conduct experiments on specific datasets. Therefore, no information regarding training/test/validation dataset splits is provided. |
| Hardware Specification | No | The paper is theoretical and focuses on mathematical analysis and proofs of convergence. It does not describe any empirical experiments or the hardware used to run them. |
| Software Dependencies | No | The paper is theoretical and focuses on mathematical analysis and proofs of convergence. It does not describe any implementation details, specific software, or their version numbers. |
| Experiment Setup | No | The paper is theoretical and focuses on mathematical analysis and proofs. It does not describe any empirical experimental setup, specific hyperparameter values, or system-level training settings. |
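Since the paper supplies no code or pseudocode, the update rule (1.1), $f_{t+1} = f_t - \eta_t \phi'(y_t, f_t(x_t)) K_{x_t}$, can be sketched as follows. This is an illustrative reconstruction, not the authors' implementation: the Gaussian kernel, the least-squares loss $\phi(y, a) = \frac{1}{2}(y - a)^2$ (so $\phi'(y, a) = a - y$), and the step sizes $\eta_t = 1/\sqrt{t}$ are all assumed for concreteness. The iterate $f_t$ lives in the RKHS and is maintained as a kernel expansion over the observed points.

```python
import numpy as np

def gaussian_kernel(x, xp, sigma=1.0):
    # Illustrative choice of Mercer kernel K(x, x'); the paper leaves K abstract.
    return np.exp(-np.sum((np.asarray(x) - np.asarray(xp)) ** 2) / (2 * sigma ** 2))

def online_gradient_descent(stream, eta=lambda t: 1.0 / np.sqrt(t)):
    """Unregularized online gradient descent in an RKHS, sketched for the
    least-squares loss phi(y, a) = (y - a)^2 / 2, so phi'(y, a) = a - y.
    f_t is represented as the kernel expansion f_t = sum_s c_s * K_{x_s}."""
    centers, coefs = [], []
    for t, (x, y) in enumerate(stream, start=1):
        # Evaluate f_t(x_t) via the current kernel expansion (f_1 = 0).
        fx = sum(c * gaussian_kernel(xs, x) for xs, c in zip(centers, coefs))
        # Update f_{t+1} = f_t - eta_t * phi'(y_t, f_t(x_t)) * K_{x_t}:
        # appending one term to the expansion realizes the subtraction.
        centers.append(x)
        coefs.append(-eta(t) * (fx - y))
    return centers, coefs
```

Each incoming example adds one expansion term, so the memory cost grows linearly with $t$; this matches the unregularized setting analyzed in the paper, where no truncation or sparsification of the iterate is assumed.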