Stochastic Nested Variance Reduction for Nonconvex Optimization

Authors: Dongruo Zhou, Pan Xu, Quanquan Gu

JMLR 2020

Each entry below gives a reproducibility variable, the assessed result, and the supporting evidence extracted from the paper.
Research Type: Experimental. Evidence: "Thorough experimental results on different nonconvex optimization problems back up our theory. ... In this section, we conduct experiments to validate the superiority of the proposed algorithms."
Researcher Affiliation: Academia. Evidence: Dongruo Zhou, Pan Xu, and Quanquan Gu are all listed with the Department of Computer Science, University of California, Los Angeles, Los Angeles, CA 90095, USA.
Pseudocode: Yes. Evidence: Algorithm 1, One-epoch-SNVRG(x0, F, K, M, {Tl}, {Bl}, B0); Algorithm 2, SNVRG(z0, F, K, M, {Tl}, {Bl}, B0, S); Algorithm 3, SNVRG-PL(z0, F, K, M, {Tl}, {Bl}, B0, S, U); Algorithm 4, SNVRG + Neon2finite(z0, F, K, M, {Tl}, {Bl}, B0, U, ϵ, ϵH, δ, η, L1, L2); Algorithm 5, SNVRG + Neon2online(z0, F, K, M, {Tl}, {Bl}, U, ϵ, ϵH, δ, η, L1, L2).
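To make the nested-reference-point idea behind One-epoch-SNVRG concrete, here is a minimal sketch of the simplest special case (K = 1 nesting level), where the estimator reduces to an SCSG/SVRG-style scheme: a reference gradient over a large batch B, corrected at each inner step by small-batch gradient differences over B1. The toy least-squares objective and all hyperparameter values are illustrative assumptions, not the paper's setup; the full algorithm generalizes this to K nested levels of reference points refreshed at different frequencies.

```python
import numpy as np

# Toy finite-sum problem: f(x) = (1/n) * sum_i 0.5 * (a_i^T x - b_i)^2.
rng = np.random.default_rng(0)
n, d = 512, 10
A = rng.standard_normal((n, d))
b = rng.standard_normal(n)

def grad(x, idx):
    """Mini-batch gradient over the components indexed by idx."""
    Ai = A[idx]
    return Ai.T @ (Ai @ x - b[idx]) / len(idx)

def loss(x):
    return 0.5 * np.mean((A @ x - b) ** 2)

def snvrg_one_level(x0, epochs=20, B=256, B1=32, T=8, eta=0.1):
    """One-level (K = 1) variance-reduced loop, a simplified sketch."""
    x = x0.copy()
    for _ in range(epochs):
        x_ref = x.copy()
        idx_B = rng.choice(n, B, replace=False)
        g_ref = grad(x_ref, idx_B)      # reference gradient on the large batch
        for _ in range(T):              # T inner steps reuse g_ref
            idx_B1 = rng.choice(n, B1, replace=False)
            # Variance-reduced estimate: reference plus small-batch correction.
            v = g_ref + grad(x, idx_B1) - grad(x_ref, idx_B1)
            x = x - eta * v
    return x

x0 = rng.standard_normal(d)
x_final = snvrg_one_level(x0)
print(loss(x0), loss(x_final))
```

The correction term has zero mean conditioned on x and x_ref, so `v` is an unbiased gradient estimate whose variance shrinks as x stays close to the reference point; SNVRG stacks K such corrections with progressively smaller batches and shorter refresh periods.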
Open Source Code: No. The paper does not state that source code for the described methodology is publicly available. While it notes that all algorithms were implemented in PyTorch 0.4.0 with Python 3.6.4, it provides no repository link or affirmative statement about releasing the code.
Open Datasets: Yes. Evidence: "We use three image datasets: (1) The MNIST dataset (Schölkopf and Smola, 2002)... (2) CIFAR10 dataset (Krizhevsky, 2009)... (3) SVHN dataset (Netzer et al., 2011)..."
Dataset Splits: Yes. Evidence: "The MNIST dataset (Schölkopf and Smola, 2002) consists of handwritten digits and has 50,000 training examples and 10,000 test examples... CIFAR10 dataset (Krizhevsky, 2009) consists of images in 10 classes and has 50,000 training examples and 10,000 test examples... SVHN dataset (Netzer et al., 2011) consists of images of digits and has 531,131 training examples and 26,032 test examples."
Hardware Specification: Yes. Evidence: "All experiments are conducted on Amazon AWS p2.xlarge servers, which come with an Intel Xeon E5 CPU and an NVIDIA Tesla K80 GPU (12 GB GPU RAM)."
Software Dependencies: Yes. Evidence: "All algorithms are implemented in PyTorch version 0.4.0 with Python 3.6.4."
Experiment Setup: Yes. Evidence: "For SGD, we search the batch size over {256, 512, 1024, 2048} and the initial step size over {1, 0.1, 0.01}. For SGD-momentum, we set the momentum parameter to 0.9 and search the batch size over {256, 512, 1024, 2048} and the initial learning rate over {1, 0.1, 0.01}. For ADAM, we search the batch size over {256, 512, 1024, 2048} and the initial learning rate over {0.01, 0.001, 0.0001}. For SCSG and SNVRG, we choose loop parameters {T_l} that automatically satisfy B_l ∏_{j=1}^{l} T_j = B. In addition, for SCSG, we set the batch sizes (B, B_1) = (B, B/b), where b is the batch-size ratio parameter; we search B over {256, 512, 1024, 2048}, b over {2, 4, 8}, and the initial learning rate over {1, 0.1, 0.01}. For our proposed SNVRG algorithm, we set the batch sizes (B, B_1, B_2) = (B, B/b, B/b²), where b is the batch-size ratio parameter; we search B over {256, 512, 1024, 2048}, b over {2, 4, 8}, and the initial learning rate over {1, 0.1, 0.01}."
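The search grids above amount to an exhaustive sweep over the Cartesian product of each algorithm's hyperparameter ranges. A minimal sketch of that procedure, assuming a hypothetical `train_and_eval` callback (a stand-in for running one configuration and returning its final training loss), could look like this:

```python
import itertools

# Grids taken from the quoted setup; SNVRG's batch sizes follow (B, B/b, B/b^2).
sgd_grid = {
    "batch_size": [256, 512, 1024, 2048],
    "lr": [1, 0.1, 0.01],
}
snvrg_grid = {
    "B": [256, 512, 1024, 2048],
    "b": [2, 4, 8],   # batch-size ratio parameter
    "lr": [1, 0.1, 0.01],
}

def configurations(grid):
    """Yield every configuration in the Cartesian product of the grid."""
    keys = list(grid)
    for values in itertools.product(*(grid[k] for k in keys)):
        yield dict(zip(keys, values))

def best_config(grid, train_and_eval):
    """Return the configuration minimizing the evaluation score."""
    return min(configurations(grid), key=train_and_eval)

# Example with a dummy objective (hypothetical, for illustration only)
# that prefers a moderate learning rate:
dummy = lambda cfg: abs(cfg["lr"] - 0.1)
print(best_config(sgd_grid, dummy))
```

For SGD this grid has 4 × 3 = 12 configurations, while SNVRG's has 4 × 3 × 3 = 36; in practice each configuration would be a full training run, so the sweep dominates the experimental cost.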