Stochastic Nested Variance Reduction for Nonconvex Optimization
Authors: Dongruo Zhou, Pan Xu, Quanquan Gu
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we compare our algorithm SNVRG with other baseline algorithms on training a convolutional neural network for image classification. We plotted the training loss and test error for different algorithms on each dataset in Figure 3. |
| Researcher Affiliation | Academia | Dongruo Zhou, Pan Xu, Quanquan Gu — all with the Department of Computer Science, University of California, Los Angeles, Los Angeles, CA 90095 (EMAIL) |
| Pseudocode | Yes | Algorithm 1 One-epoch-SNVRG(x0, F, K, M, {Tl}, {Bl}, B), Algorithm 2 SNVRG, Algorithm 3 SNVRG-PL |
| Open Source Code | No | The paper mentions the implementation environment ('All algorithm are implemented in Pytorch platform version 0.4.0 within Python 3.6.4.') but does not provide any statement about releasing the source code or a link to a repository for the described methodology. |
| Open Datasets | Yes | We use three image datasets: (1) The MNIST dataset [42] consists of handwritten digits and has 50,000 training examples and 10,000 test examples. (2) CIFAR10 dataset [22] consists of images in 10 classes and has 50,000 training examples and 10,000 test examples. (3) SVHN dataset [33] consists of images of digits and has 531,131 training examples and 26,032 test examples. |
| Dataset Splits | No | The paper provides training and test set sizes for the datasets (e.g., '50,000 training examples and 10,000 test examples' for MNIST), but does not explicitly mention a validation split or details for reproducing such a split. |
| Hardware Specification | Yes | All experiments are conducted on Amazon AWS p2.xlarge servers, which come with an Intel Xeon E5 CPU and an NVIDIA Tesla K80 GPU (12 GB GPU RAM). |
| Software Dependencies | Yes | All algorithms are implemented on the PyTorch platform, version 0.4.0, with Python 3.6.4. |
| Experiment Setup | Yes | For SGD, we search the batch size from {256, 512, 1024, 2048} and the initial step sizes from {1, 0.1, 0.01}. Following the convention of deep learning practice, we apply learning rate decay schedule to each algorithm with the learning rate decayed by 0.1 every 20 epochs. |
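The reported tuning procedure (batch size and initial step size searched over small grids, with the learning rate decayed by 0.1 every 20 epochs) can be sketched in plain Python. This is a minimal illustration, not the authors' code; in PyTorch the same schedule corresponds to `torch.optim.lr_scheduler.StepLR(optimizer, step_size=20, gamma=0.1)`.

```python
# Sketch of the step-decay learning-rate rule from the experiment setup:
# the rate is multiplied by gamma=0.1 once every 20 epochs.

def step_decay(lr0: float, epoch: int, gamma: float = 0.1, every: int = 20) -> float:
    """Learning rate at a given epoch under a step-decay schedule."""
    return lr0 * gamma ** (epoch // every)

# Hypothetical grid mirroring the reported SGD search ranges:
# batch sizes {256, 512, 1024, 2048} x initial step sizes {1, 0.1, 0.01}.
grid = [(b, lr) for b in (256, 512, 1024, 2048) for lr in (1, 0.1, 0.01)]

print(len(grid))            # 12 candidate configurations
print(step_decay(0.1, 40))  # initial lr 0.1 decayed twice (epochs 20 and 40)
```

Each of the 12 grid configurations would be trained with this schedule and compared on training loss and test error, as in the paper's Figure 3.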