On Markov Chain Gradient Descent
Authors: Tao Sun, Yuejiao Sun, Wotao Yin
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We present two kinds of numerical results. The first one is to show that MCGD uses fewer samples to train both a convex model and a nonconvex model. The second one demonstrates the advantage of the faster mixing of a non-reversible Markov chain. Our results on nonconvex objective and non-reversible chains are new. |
| Researcher Affiliation | Academia | Tao Sun: College of Computer, National University of Defense Technology, Changsha, Hunan 410073, China. Yuejiao Sun: Department of Mathematics, University of California, Los Angeles, Los Angeles, CA 90095, USA. Wotao Yin: Department of Mathematics, University of California, Los Angeles, Los Angeles, CA 90095, USA. |
| Pseudocode | No | The paper describes algorithms mathematically using equations (e.g., (2), (3), (5)), but does not present them in a structured pseudocode or algorithm block. |
| Open Source Code | No | The paper does not provide any statement or link indicating that the source code for their methodology is openly available. |
| Open Datasets | No | The paper describes generating data for its experiments (e.g., 'Randomly sample a vector $u \in \mathbb{R}^d$, $d = 50$' and 'construct an undirected connected graph with $n = 20$ nodes with edges randomly generated') rather than using a publicly available dataset with concrete access information or citations. |
| Dataset Splits | No | The paper does not explicitly provide details about validation dataset splits. It discusses 'training' models but not a separate 'validation' set. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used to run the experiments, such as GPU or CPU models. |
| Software Dependencies | No | The paper does not list specific software dependencies with version numbers (e.g., Python 3.x, PyTorch x.x). |
| Experiment Setup | Yes | We choose $\gamma_k = 1/k^q$ as our stepsize, where $q = 0.501$. This choice is consistent with our theory below. |
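
Since the paper releases no code or pseudocode, the stepsize rule quoted in the table can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a generic Markov chain gradient descent update of the form $x^{k+1} = x^k - \gamma_k \nabla f_{j_k}(x^k)$, where $j_k$ follows a Markov chain, and it does not reproduce the paper's equations (2), (3), or (5). The toy quadratic objective, the lazy random walk on a ring, and the names `stepsize`, `ring_transition_matrix`, and `mcgd` are all hypothetical choices made for illustration. Only $q = 0.501$ and $n = 20$ are taken from the quoted text.

```python
import numpy as np


def stepsize(k, q=0.501):
    """Diminishing stepsize gamma_k = 1 / k**q for k >= 1 (q = 0.501 as quoted)."""
    return 1.0 / k ** q


def ring_transition_matrix(n):
    """Lazy random walk on a ring of n >= 3 nodes (an assumed toy chain).

    Each node stays put or moves to either neighbor with probability 1/3.
    The matrix is doubly stochastic, so the stationary distribution is uniform.
    """
    P = np.zeros((n, n))
    for i in range(n):
        for j in (i - 1, i, i + 1):
            P[i, j % n] = 1.0 / 3.0
    return P


def mcgd(grads, P, x0, num_iters, q=0.501, seed=0):
    """Sketch of Markov chain gradient descent.

    grads[i](x) returns the gradient of the i-th component function at x.
    The component index is not drawn i.i.d.; it follows the chain with
    row-stochastic transition matrix P.
    """
    rng = np.random.default_rng(seed)
    n = len(grads)
    x = np.array(x0, dtype=float)
    j = rng.integers(n)  # arbitrary initial state of the chain
    for k in range(1, num_iters + 1):
        x = x - stepsize(k, q) * grads[j](x)  # gradient step at the current state
        j = rng.choice(n, p=P[j])             # move the chain one step
    return x


# Hypothetical toy problem: f_i(x) = 0.5 * ||x - a_i||^2, so the minimizer of
# the average of the f_i is the mean of the a_i.
n, d = 20, 5
data_rng = np.random.default_rng(1)
targets = data_rng.normal(size=(n, d))
grads = [lambda x, a=a: x - a for a in targets]
P = ring_transition_matrix(n)

x_star = targets.mean(axis=0)
x0 = x_star + 100.0  # start far from the minimizer
x_final = mcgd(grads, P, x0, num_iters=5000)

print("initial error:", np.linalg.norm(x0 - x_star))
print("final error:  ", np.linalg.norm(x_final - x_star))
```

Because consecutive samples along the chain are correlated, the final error on this toy problem shrinks slowly. The sketch is meant to show the structure of the update, not to reproduce the paper's numerical results.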