Neural Q-learning for solving PDEs
Authors: Samuel N. Cohen, Deqing Jiang, Justin Sirignano
JMLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Numerical results are presented in Section 6 (Numerical experiments): "In this section, we present numerical results where we apply our algorithm to solve a family of partial differential equations. The approximator matches the solution of the differential equation closely in a relatively short period of training time in these test cases." In Section 6.2.1 (1-dimensional case), for dimension n = 1 the domain is Ω = (−1, 1), and the exact solution of the test equation is u(x) = 1/γ + c₁(e^{√(2γ)x} + e^{−√(2γ)x}), where c₁ = −1/(γ(e^{√(2γ)} + e^{−√(2γ)})); the subsection sets γ = 0.1. To track training progress, the authors monitor the average loss level at time t, e_t := ∫_Ω [LQ_t]² dµ, where µ is the Lebesgue measure on Ω, estimated using the sample of evaluation points at each step. |
| Researcher Affiliation | Academia | Samuel N. Cohen (EMAIL), Mathematical Institute, University of Oxford, Oxford, OX2 6GG, UK; Deqing Jiang (EMAIL), Mathematical Institute, University of Oxford, Oxford, OX2 6GG, UK; Justin Sirignano (EMAIL), Mathematical Institute, University of Oxford, Oxford, OX2 6GG, UK |
| Pseudocode | Yes | Algorithm 1 (Q-PDE Algorithm). Parameters: hyper-parameters of the single-layer neural network; domain Ω; PDE operator L; boundary condition on ∂Ω; sampling measure µ; number of Monte Carlo points M; upper bound T on training time. Initialise: neural net S^N; auxiliary function η; approximator Q^N based on S^N and η; smoothing function ψ_N; learning-rate schedule {α^N_t}_{t≥0}; stopping criterion ϵ; current time t = 0. While err > ϵ and t ≤ T do: sample M points {x_i} in Ω using µ; compute the biased gradient estimator G^N_{M,t} using (112); update the neural network parameters via (111); compute err = (1/M) Σ_{i=1}^{M} ψ_N(LQ^N_t(x_i))²; update time t. Return approximator Q^N_t. |
| Open Source Code | Yes | The implementation is available at https://github.com/DeqingJ/QPDE. |
| Open Datasets | No | Section 6.2 discusses a "Test equation: Survival time of a Brownian motion", which has an explicit solution. Algorithm 1, under the step "Sample M points in Ω using µ, {x_i}", indicates that the training data is generated rather than drawn from a pre-existing, publicly available dataset. There is no mention of external datasets or links to any data repositories. |
| Dataset Splits | No | The paper describes sampling M points from the domain for training (Algorithm 1: "Sample M points in Ω using µ, {x_i}"). However, it does not specify any explicit training, validation, or test splits, or percentages, for these sampled points. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments, such as CPU or GPU models, or memory specifications. It only mentions the use of "PyTorch" for implementation. |
| Software Dependencies | No | The paper mentions using "PyTorch" and the "ADAM adaptive gradient descent rule" (optimizer) but does not provide specific version numbers for either of these software components. |
| Experiment Setup | Yes | A.8.1 Table of hyper-parameters: Dimension 1 — Q-PDE, 1 layer, 64 units, Sigmoid activation, ADAM optimizer, number of MC samples l_MC = 1k to u_MC = 2k; Dimension 20 — Q-PDE, 1 layer, 256 units, Sigmoid, ADAM, l_MC = 2k to u_MC = 10k; Dimension 20 — DGM, 1 layer, 256 units, Sigmoid, ADAM, l_MC = 2k to u_MC = 10k. A.8.2 Initialization of neural networks: parameters of the single-layer net S_0 are randomly sampled; the c^i_0 are i.i.d. from the uniform distribution U[−1, 1], and the w^i_0 and b^i_0 are i.i.d. from the Gaussian distributions N(0, I_d) and N(0, 1), where I_d is the identity matrix of dimension d. A.8.3 Learning process: the approximator is Q_t := S_t · η, where η(x) := 1 − |x|². The built-in ADAM optimizer is used with initial learning rate l_0 = 0.5, decaying as l_t = l_0/(1 + t/200). At each step, MC samples are drawn for the gradient estimate; since a larger number of MC sample points reduces the random error, the number M_t of MC points sampled at each step increases linearly as M_t = round(l_MC + (u_MC − l_MC)·t/T), where T is the terminal number of training steps. |
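As a sanity check on the reconstructed 1-dimensional test problem from Section 6.2.1, the closed-form survival-time solution can be evaluated directly. The sketch below assumes the PDE takes the form γu − ½u″ = 1 on Ω = (−1, 1) with u(±1) = 0, which is consistent with the stated exact solution; the function names (`u_exact`, `pde_residual`) are illustrative, not from the paper.

```python
import math

GAMMA = 0.1  # value used in Section 6.2.1

def u_exact(x):
    """Reconstructed exact solution of gamma*u - u''/2 = 1 on (-1, 1), u(+-1) = 0."""
    s = math.sqrt(2 * GAMMA)
    c1 = -1.0 / (GAMMA * (math.exp(s) + math.exp(-s)))
    return 1.0 / GAMMA + c1 * (math.exp(s * x) + math.exp(-s * x))

def pde_residual(x, h=1e-4):
    """gamma*u - u''/2 - 1, with u'' approximated by a central finite difference."""
    u_xx = (u_exact(x + h) - 2 * u_exact(x) + u_exact(x - h)) / h**2
    return GAMMA * u_exact(x) - 0.5 * u_xx - 1.0

print(u_exact(1.0), u_exact(-1.0))  # boundary values, both ~0
print(max(abs(pde_residual(0.1 * k)) for k in range(-9, 10)))  # residual ~0
```

With these constants, the boundary values vanish and the finite-difference residual is at the level of floating-point noise, matching the claim that c₁ is chosen to enforce u(±1) = 0.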
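The initialization (A.8.2) and the hard boundary-condition construction Q = S · η (A.8.3) can be sketched in a few lines for the 1-dimensional, 64-unit sigmoid network of A.8.1. This is a minimal illustration, not the paper's implementation: η(x) = 1 − x² is the assumed 1-D form of the auxiliary function, and variable names are chosen for readability.

```python
import math, random

random.seed(0)
n_units = 64  # 1-dimensional case with 64 hidden units (A.8.1)

# A.8.2 initialisation: c_i ~ U[-1, 1], w_i ~ N(0, 1), b_i ~ N(0, 1) (here d = 1)
c = [random.uniform(-1.0, 1.0) for _ in range(n_units)]
w = [random.gauss(0.0, 1.0) for _ in range(n_units)]
b = [random.gauss(0.0, 1.0) for _ in range(n_units)]

def S(x):
    """Single-layer network S(x) = sum_i c_i * sigmoid(w_i * x + b_i)."""
    return sum(ci / (1.0 + math.exp(-(wi * x + bi))) for ci, wi, bi in zip(c, w, b))

def Q(x):
    """Approximator Q = S * eta with eta(x) = 1 - x^2 (assumed form)."""
    return S(x) * (1.0 - x * x)

print(abs(Q(1.0)), abs(Q(-1.0)))  # 0.0 0.0 — boundary condition holds by construction
```

Because η vanishes on ∂Ω, the approximator satisfies the zero boundary condition exactly for any network parameters, which is why the algorithm only needs to penalize the interior PDE residual.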
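The learning-rate decay and the Monte Carlo ramp-up reported in A.8.3 are simple enough to transcribe directly. A sketch, with illustrative function names and the 1-dimensional values l_MC = 1k, u_MC = 2k from A.8.1:

```python
def learning_rate(t, l0=0.5):
    """A.8.3 decay rule: l_t = l_0 / (1 + t/200), starting from l_0 = 0.5."""
    return l0 / (1.0 + t / 200.0)

def mc_points(t, T, l_mc, u_mc):
    """A.8.3 ramp: M_t = round(l_MC + (u_MC - l_MC) * t / T)."""
    return round(l_mc + (u_mc - l_mc) * t / T)

print(learning_rate(0), learning_rate(200))            # 0.5 0.25
print(mc_points(0, 1000, 1000, 2000),
      mc_points(1000, 1000, 1000, 2000))               # 1000 2000
```

The linear ramp means early steps are cheap and noisy while later steps spend more samples per gradient estimate, trading compute for lower Monte Carlo variance as training converges.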