Continuous Time Analysis of Momentum Methods

Authors: Nikola B. Kovachki, Andrew M. Stuart

JMLR 2021

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "Through these approximation theorems, and accompanying numerical experiments, we make the following contributions to the understanding of momentum methods as often implemented within machine learning: We provide numerical experiments which illustrate the foregoing considerations, for simple linear test problems, and for the MNIST digit classification problem; in the latter case we consider SGD and thereby demonstrate that the conclusions of our theory have relevance for understanding the stochastic setting as well. To demonstrate that our analysis is indeed relevant in the stochastic setting, we train a deep autoencoder with mini-batching (stochastic) and verify that our convergence results still hold. The details of this experiment are given in section 5." |
| Researcher Affiliation | Academia | "Nikola B. Kovachki, EMAIL, Computing and Mathematical Sciences, California Institute of Technology, Pasadena, CA 91125, USA; Andrew M. Stuart, EMAIL, Computing and Mathematical Sciences, California Institute of Technology, Pasadena, CA 91125, USA" |
| Pseudocode | No | The paper describes its optimization methods and numerical schemes using mathematical equations (e.g., equations (6), (7), (9), (10), (15), (35), (36)) and detailed textual explanations, but it does not include any explicitly labeled pseudocode or algorithm blocks with structured, code-like steps. |
| Open Source Code | No | The paper contains no explicit statement about releasing source code for the methodology described, nor does it provide a link to a code repository. |
| Open Datasets | Yes | "We train a deep autoencoder, using the architecture of Hinton and Salakhutdinov (2006) on the MNIST dataset LeCun and Cortes (2010)." |
| Dataset Splits | No | "Since our work is concerned only with optimization and not generalization, we present our results only on the training set of 60,000 images and ignore the testing set." |
| Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments, such as GPU/CPU models or memory specifications. |
| Software Dependencies | No | The paper mentions "widely used deep learning libraries such as TensorFlow (Abadi et al., 2015) and PyTorch (Paszke et al., 2017)" in a general context but does not specify the versions of any software dependencies used in its own experiments. |
| Experiment Setup | Yes | "We fix an initialization of the autoencoder following Glorot and Bengio (2010) and use it to test every optimization method. Furthermore, we fix a batch size of 200 and train for 500 epochs, not shuffling the data set during training so that each method sees the same realization of the noise. We use the mean-squared error as our loss function. We were unable to train the autoencoder using (35) with h = 1 since λ = 0.9 implies an effective learning rate of 10, for which the system blows up. Since deep neural networks are not strongly convex, there is no single optimal choice of µ; we simply set µ = 1 in our experiments." |
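The quoted "effective learning rate of 10" follows from the standard heavy-ball velocity recursion v ← λv − h∇Φ(x): under a (hypothetically) constant gradient, the velocity accumulates geometrically to −h∇Φ/(1 − λ), so h = 1 and λ = 0.9 yield an effective step size of h/(1 − λ) = 10. A minimal sketch of this accumulation, assuming the paper's equation (35) (not reproduced here) is the usual momentum update:

```python
def momentum_velocity(h, lam, g, steps=200):
    """Iterate the heavy-ball velocity update v <- lam*v - h*g
    with a constant scalar gradient g; the geometric series
    converges to the limiting velocity -h*g / (1 - lam)."""
    v = 0.0
    for _ in range(steps):
        v = lam * v - h * g
    return v

# With h = 1 and lam = 0.9 the velocity settles near -h*g/(1 - lam) = -10*g,
# i.e. each iterate effectively moves by 10x the raw gradient step.
print(momentum_velocity(1.0, 0.9, 1.0))  # ≈ -10.0
```

This is why halting divergence required reducing h rather than λ: the effective rate scales as h/(1 − λ), so at λ = 0.9 every unit of h is amplified tenfold.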