Shuffling Gradient-Based Methods for Nonconvex-Concave Minimax Optimization
Authors: Quoc Tran-Dinh, Trang H. Tran, Lam M. Nguyen
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We perform some experiments to illustrate Algorithm 1 and compare it with two existing and related algorithms. ... We test these algorithms on two datasets from LIBSVM [6]. ... The results are shown in Figure 1 for w8a and rcv1 datasets using kb = 32 blocks. |
| Researcher Affiliation | Collaboration | Quoc Tran-Dinh, Department of Statistics and Operations Research, The University of North Carolina at Chapel Hill (EMAIL); Trang H. Tran, School of Operations Research and Information Engineering, Cornell University, Ithaca, NY (EMAIL); Lam M. Nguyen, IBM Research, Thomas J. Watson Research Center, Yorktown Heights, NY (EMAIL) |
| Pseudocode | Yes | Algorithm 1 (Shuffling Proximal Gradient-Based Algorithm for Solving (10)) |
| Open Source Code | Yes | Our data is available online from LIBSVM. The code is implemented in Python. The code for all experiments is also provided with instruction. |
| Open Datasets | Yes | We test these algorithms on two datasets from LIBSVM [6]. |
| Dataset Splits | No | The paper states that full details are in Supp. Doc. D, but the main text does not specify exact training/validation/test dataset splits (e.g., percentages or counts). |
| Hardware Specification | Yes | "Our experiments were run on a MacBook Pro (2.8GHz Quad-Core Intel Core i7, 16GB memory)", as specified at the beginning of Supp. Doc. D. |
| Software Dependencies | No | The paper states that “The code is implemented in Python.” but does not specify version numbers for Python or any specific libraries or solvers used. |
| Experiment Setup | Yes | We set λ := 10^{-4} and update the smoothing parameter γ_t as γ_t := 1/(2(t+1)^{1/3}). The learning rate for all algorithms is finely tuned from {100, 50, 10, 5, 1, 0.5, 0.1, 0.05, 0.01, 0.001, 0.0001}, and the results are shown in Figure 1 for w8a and rcv1 datasets using kb = 32 blocks. |
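
The experiment-setup row above can be condensed into a short configuration sketch. The snippet below is a minimal, illustrative reconstruction, not the authors' released code: it assumes scikit-learn's `load_svmlight_file` for reading the LIBSVM data, assumes λ = 10^{-4}, and uses hypothetical local file paths and function names.

```python
# Minimal sketch of the quoted experiment setup (illustrative, not the authors' code).
# Assumes the w8a and rcv1 files have already been downloaded from the LIBSVM site.
from sklearn.datasets import load_svmlight_file

# Hypothetical local paths to the LIBSVM files.
X_w8a, y_w8a = load_svmlight_file("w8a")
X_rcv1, y_rcv1 = load_svmlight_file("rcv1_train.binary")

LAMBDA = 1e-4          # regularization parameter lambda (assumed to be 10^{-4})
NUM_BLOCKS = 32        # kb = 32 blocks, as reported for Figure 1
LR_GRID = [100, 50, 10, 5, 1, 0.5, 0.1, 0.05, 0.01, 0.001, 0.0001]  # tuning grid

def smoothing_parameter(t: int) -> float:
    """Smoothing schedule gamma_t := 1 / (2 * (t + 1)**(1/3))."""
    return 1.0 / (2.0 * (t + 1) ** (1.0 / 3.0))

# Example: the first few values of the smoothing schedule.
print([round(smoothing_parameter(t), 4) for t in range(5)])
```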