Are Convex Optimization Curves Convex?

Authors: Guy Barzilai, Ohad Shamir, Moslem Zamani

TMLR 2025 | Venue PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Theoretical In this paper, we study when we might expect the optimization curve induced by gradient descent to be convex precluding, for example, an initial plateau followed by a sharp decrease, making it difficult to decide when optimization should stop. We show, perhaps surprisingly, that the answer crucially depends on the choice of the step size. In particular, for the range of step sizes which are known to result in monotonic convergence to an optimal value, we characterize a regime where the optimization curve will be provably convex, and a regime where the curve can be non-convex. We also extend our results to gradient flow, and to the closely-related but different question of whether the gradient norm decreases monotonically.
Researcher Affiliation Academia Guy Barzilai EMAIL Tel-Aviv University Ohad Shamir EMAIL Weizmann Institute of Science Moslem Zamani EMAIL UCLouvain
Pseudocode No The paper contains mathematical derivations, proofs, and definitions, but no structured pseudocode or algorithm blocks are present.
Open Source Code No The paper does not contain any explicit statements about the release of source code, nor does it provide links to any code repositories or mention code in supplementary materials.
Open Datasets No The paper is theoretical and does not use any datasets for experiments. The examples provided (e.g., f(x) = x^2, f(x) = |x| + max{0, x}) are for illustrative and theoretical proof purposes, not based on empirical data.
Dataset Splits No Since the paper is theoretical and does not use any datasets for empirical evaluation, there is no mention of dataset splits.
Hardware Specification No The paper is theoretical and does not involve empirical experiments requiring specific hardware. Therefore, no hardware specifications are mentioned.
Software Dependencies No The paper is theoretical, focusing on mathematical proofs and analysis. It does not mention any specific software or programming libraries with version numbers used for implementation or experimentation.
Experiment Setup No The paper focuses on theoretical analysis and proofs regarding optimization curves. It does not describe any empirical experiments or their setup, and thus no hyperparameters or system-level training settings are provided.