Losing Momentum in Continuous-time Stochastic Optimisation
Authors: Kexin Jin, Jonas Latz, Chenguang Liu, Alessandro Scagliotti
JMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In experiments, we study our scheme in convex and non-convex test problems. Additionally, we train a convolutional neural network in an image classification problem. Our algorithm attains competitive results compared to stochastic gradient descent with momentum. |
| Researcher Affiliation | Academia | Kexin Jin EMAIL Department of Mathematics Princeton University Princeton, NJ 08544-1000, USA; Jonas Latz EMAIL Department of Mathematics The University of Manchester Manchester, M13 9PL, United Kingdom; Chenguang Liu EMAIL Delft Institute of Applied Mathematics Technische Universiteit Delft 2628 Delft, The Netherlands; Alessandro Scagliotti EMAIL CIT School, Technische Universität München |
| Pseudocode | No | The paper includes mathematical equations for algorithms (e.g., (9), (10), (36)) but does not present them in a structured pseudocode block or algorithm section. |
| Open Source Code | No | The paper does not contain an explicit statement about the release of source code for the methodology described, nor does it provide a direct link to a code repository. |
| Open Datasets | Yes | We now employ the discrete SGMP method (36) to solve the CIFAR-10 (Krizhevsky, 2009), image classification task with a convolutional neural network (CNN). |
| Dataset Splits | Yes | The CIFAR-10 data set consists of 6 · 10^4 colour images (32 × 32 pixels) which are split into 5 · 10^4 training images and 10^4 test images. |
| Hardware Specification | Yes | The training is done with Google Colab using GPUs (often Tesla V100, sometimes Tesla A100). |
| Software Dependencies | No | The paper mentions general software environments like Google Colab and the use of CNNs, but does not specify particular software libraries or frameworks with version numbers (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | We train for 800 epochs with batch size ℓ = 100 and no weight decay. We use constant learning rate η = 0.01 for SGD and the classical momentum. In classical momentum, we set the momentum hyperparameter ρ = 0.9. |
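
Since the paper releases no code, the baseline setup quoted above (classical heavy-ball momentum with constant learning rate η = 0.01 and momentum ρ = 0.9) can be sketched as follows. This is a minimal illustrative implementation, not the authors' code: the quadratic objective is a stand-in test problem, whereas the paper trains a CNN on CIFAR-10 with batch size 100.

```python
# Sketch of SGD with classical (heavy-ball) momentum, using the
# hyperparameters reported in the table: eta = 0.01, rho = 0.9.
import numpy as np

def sgd_momentum(grad, x0, eta=0.01, rho=0.9, steps=1000):
    """Heavy-ball update: v <- rho * v - eta * grad(x); x <- x + v."""
    x = np.asarray(x0, dtype=float)
    v = np.zeros_like(x)
    for _ in range(steps):
        v = rho * v - eta * grad(x)
        x = x + v
    return x

# Toy convex test problem: f(x) = 0.5 * ||x||^2, so grad f(x) = x;
# the iterates spiral into the minimiser at the origin.
x_star = sgd_momentum(lambda x: x, x0=[5.0, -3.0])
```

In a deep-learning framework the same update is typically exposed as an optimizer option (e.g. a `momentum` argument on an SGD optimizer) rather than written by hand.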