Convergence of Distributed Adaptive Optimization with Local Updates
Authors: Ziheng Cheng, Margalit Glasgow
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | For the first time, we prove that Local SGD with momentum (Local SGDM) and Local Adam can outperform their minibatch counterparts in convex and weakly convex settings, respectively, in certain regimes. Our analysis relies on a novel technique to prove contraction during local iterations, which is a crucial yet challenging step in showing the advantage of local updates, under a generalized smoothness assumption and a gradient clipping strategy. |
| Researcher Affiliation | Academia | Ziheng Cheng, University of California, Berkeley, EMAIL; Margalit Glasgow, Massachusetts Institute of Technology, EMAIL |
| Pseudocode | Yes | Local Adam is shown in Algorithm 1, which is a natural extension of centralized Adam (Kingma & Ba, 2014). |
| Open Source Code | No | The paper does not contain any explicit statements about the release of source code, nor does it provide any links to code repositories. |
| Open Datasets | No | The paper is theoretical and discusses a general 'data distribution D' and 'stochastic gradient oracle F(x; ξ)' but does not mention or provide access information for any specific, publicly available datasets. |
| Dataset Splits | No | As the paper is theoretical and does not describe experiments on specific datasets, there is no mention of training/test/validation dataset splits. |
| Hardware Specification | No | The paper focuses on theoretical convergence analysis and does not describe any experimental hardware used. |
| Software Dependencies | No | The paper does not specify any software dependencies or version numbers, as it is primarily a theoretical work. |
| Experiment Setup | No | The paper is theoretical, providing convergence proofs and analysis. It does not include an experimental setup section or details on hyperparameters or training configurations. |
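The Pseudocode row notes that Local Adam (Algorithm 1 in the paper) extends centralized Adam (Kingma & Ba, 2014) with local updates. The following is a minimal sketch of that pattern, not a reproduction of Algorithm 1: it assumes each worker runs plain Adam for `local_steps` iterations and that iterates and moment estimates are all averaged at each communication round; the paper's gradient clipping step and the exact synchronization details are omitted, and all function names and defaults here are illustrative.

```python
import numpy as np

def local_adam(grad_fn, x0, workers=4, rounds=10, local_steps=8,
               lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """Sketch of Local Adam: workers run Adam locally, then average."""
    # Each worker keeps its own iterate and Adam moment estimates.
    xs = [x0.astype(float).copy() for _ in range(workers)]
    ms = [np.zeros_like(x0, dtype=float) for _ in range(workers)]
    vs = [np.zeros_like(x0, dtype=float) for _ in range(workers)]
    step = 0
    for _ in range(rounds):
        for _ in range(local_steps):
            step += 1  # shared local clock for bias correction
            for k in range(workers):
                g = grad_fn(xs[k])  # stochastic gradient oracle in the paper
                ms[k] = beta1 * ms[k] + (1 - beta1) * g
                vs[k] = beta2 * vs[k] + (1 - beta2) * g**2
                m_hat = ms[k] / (1 - beta1**step)  # bias-corrected moments
                v_hat = vs[k] / (1 - beta2**step)
                xs[k] = xs[k] - lr * m_hat / (np.sqrt(v_hat) + eps)
        # Communication round: average iterates and moments across workers
        # (whether moments are averaged is an assumption of this sketch).
        x_bar = np.mean(xs, axis=0)
        m_bar = np.mean(ms, axis=0)
        v_bar = np.mean(vs, axis=0)
        xs = [x_bar.copy() for _ in range(workers)]
        ms = [m_bar.copy() for _ in range(workers)]
        vs = [v_bar.copy() for _ in range(workers)]
    return xs[0]
```

With `local_steps=1` this degenerates to (minibatch-style) synchronized Adam; larger `local_steps` trades communication for local drift, which is exactly the contraction behavior the paper's analysis addresses.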