A Continuous-time Stochastic Gradient Descent Method for Continuous Data

Authors: Kexin Jin, Jonas Latz, Chenguang Liu, Carola-Bibiane Schönlieb

JMLR 2023

Reproducibility Variable Result LLM Response
Research Type Experimental In numerical experiments, we show the suitability of our stochastic gradient process for (convex) polynomial regression with continuous data and the (non-convex) training of physics-informed neural networks with continuous sampling of function-valued data. We end with illustrating the applicability of the stochastic gradient process in a polynomial regression problem with noisy functional data, as well as in a physics-informed neural network. We now study two fields of application of the stochastic gradient process for continuous data. In the first example, we consider regularized polynomial regression with noisy functional data. In the second example, we study so-called physics-informed neural networks.
Researcher Affiliation Academia Kexin Jin (EMAIL), Department of Mathematics, Princeton University, Princeton, NJ 08544-1000, USA; Jonas Latz (EMAIL), Department of Mathematics, The University of Manchester, Manchester, M13 9PL, United Kingdom; Chenguang Liu (EMAIL), Delft Institute of Applied Mathematics, Technische Universiteit Delft, Delft, 2628 CD, The Netherlands; Carola-Bibiane Schönlieb (EMAIL), Department of Applied Mathematics and Theoretical Physics, University of Cambridge, Cambridge, CB3 0WA, United Kingdom
Pseudocode Yes Algorithm 1 Discretized Markov pure jump process Algorithm 2 Discretized Reflected Brownian motion on S
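Algorithm 2 (a discretized reflected Brownian motion) could be sketched as below. This is an illustrative reconstruction, not the paper's exact pseudocode: the 1D domain [0, 1], the Euler step, and the folding reflection at the boundaries are assumptions.

```python
import numpy as np

def reflected_brownian_1d(x0, sigma, dt, n_steps, rng=None):
    """Sketch of a discretized reflected Brownian motion on [0, 1].

    Each Euler step adds sigma * sqrt(dt) * N(0, 1) noise, then folds
    the result back into [0, 1] by reflection at the boundaries.
    """
    rng = np.random.default_rng() if rng is None else rng
    path = np.empty(n_steps + 1)
    path[0] = x0
    for n in range(n_steps):
        x = path[n] + sigma * np.sqrt(dt) * rng.standard_normal()
        # fold the real line onto [0, 1]: reflect at 0 and 1
        x = np.abs(x) % 2.0
        path[n + 1] = 2.0 - x if x > 1.0 else x
    return path
```

Sampling a 2D index point on (0, 1) × (0, 1), as used for the transport-equation training set, would amount to running two independent copies of this process.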
Open Source Code No The paper does not provide an explicit statement or link to their own source code for the methodology described. It mentions using existing packages and code from other works, but not their own implementation.
Open Datasets No The paper uses 'artificial data g' for polynomial regression and defines a '1D Transport equation' with a known analytical solution, indicating synthetic data generation rather than the use of pre-existing public datasets. No concrete access information for any dataset is provided.
Dataset Splits Yes From the interior of the domain of time and space variables, i.e. (0, 1) × (0, 1), we use Algorithm 2 with σ = 0.5 to sample the train set of size 3 × 10^4 for SGPC and SGPD, and we uniformly sample 600 points for the train set of SGD. In addition, as part of the train set for all three methods, we sample uniformly 20 and 60 points for the initial condition and periodic boundary condition, respectively. The learning rate for SGD and SGPC is 0.01. ... We evaluate the models by testing on a uniformly sampled test set of size 2 × 10^3 and compare the predicted values with the theoretical solution u(t, x) = sin(2π(x − t)).
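A minimal sketch of such a test-set evaluation against the analytical transport-equation solution u(t, x) = sin(2π(x − t)) might look as follows. The uniform sampling and the RMSE metric are assumptions for illustration; the authors' code is not available.

```python
import numpy as np

def transport_solution(t, x):
    """Analytical solution of the 1D transport equation, u(t, x) = sin(2*pi*(x - t))."""
    return np.sin(2 * np.pi * (x - t))

def rmse(u_pred, u_ref):
    """Root-mean-square error between predicted and reference values."""
    return np.sqrt(np.mean((u_pred - u_ref) ** 2))

# Hypothetical uniformly sampled test set of size 2 * 10^3 on (0, 1) x (0, 1);
# columns are (t, x).
rng = np.random.default_rng(0)
test = rng.uniform(size=(2000, 2))
u_true = transport_solution(test[:, 0], test[:, 1])
```

A trained model's predictions at the test points would then be compared via `rmse(u_pred, u_true)`.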
Hardware Specification Yes We train the networks on Google Colab Pro using GPUs (often T4 and P100, sometimes K80).
Software Dependencies No Integrated PyTorch-based packages are available; see, for example, Chen et al. (2020) and Pedro et al. (2019). The paper mentions 'PyTorch' but does not specify a version number for it or any other software used in their own implementation.
Experiment Setup Yes In our experiments, we choose h = 0.1. We use Algorithms 1 and 2 to discretize the index processes with constant stepsize t(ℓ) − t(ℓ−1) = 10^-2. We perform J := 100 repeated runs for each of the considered settings for N := 5 × 10^4 time steps and thus obtain a family of trajectories (θ(j,n)), n = 1, ..., N, j = 1, ..., J. In each case, we choose the initial values V(0) := 0 and θ(j,0) := (0.5, ..., 0.5). For our estimation, we set α := 10^-4 and use the K = 9 Legendre polynomials with degrees 0, ..., 8. The learning rate for SGD and SGPC is 0.01. The learning rate for SGPD is defined as η(t) = 0.01 / log(t + 2)^0.3, which is chosen such that the associated µ := 1/η satisfies Assumption 4. For all three methods, we use Adam (see Kingma and Ba, 2015) as the optimizer to speed up convergence; we use an L2 regularizer with weight 0.1 to avoid overfitting. Each model is trained over 600 iterations with batch size 50. The training process for SGPC and SGPD contains only one epoch, while we train 50 epochs in the SGD case.
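The decreasing SGPD learning rate can be sketched as below. Note that the exact form (0.01 divided by log(t + 2) raised to the 0.3 power) is a reconstruction of the garbled formula in the quoted text, chosen so that µ := 1/η is increasing, as the stated Assumption 4 requires.

```python
import math

def eta_sgpd(t):
    """Reconstructed SGPD learning-rate schedule: eta(t) = 0.01 / log(t + 2)**0.3.

    Decreasing in t, so that mu(t) = 1 / eta(t) grows over time.
    """
    return 0.01 / math.log(t + 2) ** 0.3

def mu(t):
    """Associated rate mu := 1 / eta, which should be increasing."""
    return 1.0 / eta_sgpd(t)
```

For comparison, SGD and SGPC use a constant learning rate of 0.01 in the quoted setup.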