Optimal Epoch Stochastic Gradient Descent Ascent Methods for Min-Max Optimization
Authors: Yan Yan, Yi Xu, Qihang Lin, Wei Liu, Tianbao Yang
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | In this paper, we bridge this gap by providing a sharp analysis of the epoch-wise stochastic gradient descent ascent method (referred to as Epoch-GDA) for solving strongly convex strongly concave (SCSC) min-max problems, without imposing any additional assumption about smoothness or the function's structure. To the best of our knowledge, our result is the first one that shows Epoch-GDA can achieve the optimal rate of O(1/T) for the duality gap of general SCSC min-max problems. We emphasize that such generalization of Epoch-GD for strongly convex minimization problems to Epoch-GDA for SCSC min-max problems is non-trivial and requires novel technical analysis. |
| Researcher Affiliation | Collaboration | Yan Yan, School of EECS, Washington State University; Yi Xu, Machine Intelligence Technology, Alibaba Group US Inc.; Qihang Lin, Department of Business Analytics, University of Iowa; Wei Liu, Tencent AI Lab; Tianbao Yang, Department of CS, University of Iowa |
| Pseudocode | Yes | Algorithm 1 Epoch-GDA for SCSC Min-Max Problems (a minimal code sketch follows this table) |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described. |
| Open Datasets | No | The paper is theoretical and does not use or reference any datasets for training. |
| Dataset Splits | No | The paper is theoretical and does not describe dataset splits for validation. |
| Hardware Specification | No | The paper is theoretical and does not mention any specific hardware used for experiments. |
| Software Dependencies | No | The paper is theoretical and does not list software dependencies or version numbers. |
| Experiment Setup | No | The paper is theoretical and does not describe an experimental setup with hyperparameters or system-level training settings. |
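To make the epoch-wise schedule behind Epoch-GDA concrete, the following is a minimal sketch, not the authors' exact Algorithm 1 (which additionally projects onto shrinking balls around each epoch's averaged iterate). The core pattern it illustrates is: run plain stochastic gradient descent ascent within an epoch, restart the next epoch from the averaged iterate, then halve the step size and double the epoch length. The toy SCSC objective, the Gaussian noise model, and the constants `eta0` and `T0` are assumptions chosen for illustration only.

```python
import numpy as np

def epoch_gda(grad_x, grad_y, x0, y0, eta0=0.1, T0=32, num_epochs=8,
              noise=0.1, rng=None):
    """Minimal sketch of epoch-wise stochastic gradient descent ascent.

    Each epoch runs fixed-step SGDA, restarts from the averaged iterate,
    then halves the step size and doubles the epoch length.
    (Hypothetical illustration; the paper's Algorithm 1 also constrains
    each epoch to a ball around the previous epoch's average.)
    """
    rng = rng or np.random.default_rng(0)
    x, y = x0.copy(), y0.copy()
    eta, T = eta0, T0
    for _ in range(num_epochs):
        xs, ys = [], []
        for _ in range(T):
            # Simulate stochastic gradients with additive Gaussian noise.
            gx = grad_x(x, y) + noise * rng.standard_normal(x.shape)
            gy = grad_y(x, y) + noise * rng.standard_normal(y.shape)
            x = x - eta * gx   # descent step on the min variable
            y = y + eta * gy   # ascent step on the max variable
            xs.append(x)
            ys.append(y)
        # Restart the next epoch from the within-epoch averages.
        x, y = np.mean(xs, axis=0), np.mean(ys, axis=0)
        eta, T = eta / 2, T * 2  # halve step size, double epoch length
    return x, y

# Toy SCSC objective: f(x, y) = (mu/2)||x||^2 + x.y - (mu/2)||y||^2,
# whose unique saddle point is (0, 0).
mu = 1.0
gx = lambda x, y: mu * x + y   # gradient of f in x
gy = lambda x, y: x - mu * y   # gradient of f in y
x_out, y_out = epoch_gda(gx, gy, np.ones(5), np.ones(5))
print(np.linalg.norm(x_out), np.linalg.norm(y_out))  # both shrink toward 0
```

For this toy problem the quality of the returned pair can be measured by the duality gap, max_y f(x_out, y) - min_x f(x, y_out), which is the quantity the paper bounds at the optimal O(1/T) rate for general SCSC problems.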