Solving Differential Equations with Constrained Learning

Authors: Viggo Moro, Luiz Chamon

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | 5 EXPERIMENTS. In this section, we showcase the use of SCL by training MLPs and FNOs (Li et al., 2021) to solve six PDEs (convection, reaction-diffusion, eikonal, Burgers', diffusion-sorption, and Navier-Stokes). We consider different subsets of constraints from (SCL)/(SCL ) to illustrate a variety of knowledge settings, but train only the most suitable model in each case, since our goal is to illustrate the natural uses of SCL rather than exhaust its potential. Detailed descriptions are provided in the appendices, including BVPs (App. A), training procedures (App. E), and further results (App. F). Code to reproduce these experiments is available at https://github.com/vmoro1/scl. In the sequel, we use fixed points (x, t) for the BC objective rather than ψ_0^BC to illustrate how computational complexity can be reduced without significantly affecting the results. We still use ψ_0^BC for π.
Researcher Affiliation | Academia | Viggo Moro (University of Oxford); Luiz F. O. Chamon (École polytechnique)
Pseudocode | Yes | Algorithm 1: Primal-dual method for (SCL)
1: Inputs: differential operator D_π, invariant transformations γ_i ∈ G, observation sets (π_j, τ_j, u*_j), parameterized model u_θ0, and λ_0^pde = λ_0^si = λ_0^oj = 0
2: for k = 1, ..., K do
3:   ℓ_k^bc = (1/N_BC) Σ_n ‖u_{θ_k}(π_n^bc)(x_n^bc, t_n^bc) − h(x_n^bc, t_n^bc)‖², with (x_n^bc, t_n^bc, π_n^bc) ∼ ψ_0^BC
4:   ℓ_k^pde = (1/N_pde) Σ_n ‖D_{π_n^pde}[u_{θ_k}(π_n^pde)](x_n^pde, t_n^pde) − τ(x_n^pde, t_n^pde)‖², with (x_n^pde, t_n^pde, π_n^pde) ∼ ψ_0^PDE
5:   ℓ_k^si = (1/N_si) Σ_n ‖u_{θ_k}(π_n^si)(x_n^si, t_n^si) − [u_{θ_k}(π_n^si) ∘ γ_i(π_n^si)](x_n^si, t_n^si)‖², with (x_n^si, t_n^si, π_n^si) ∼ ψ_0^STi
6:   ℓ_k^oj = (1/N_o) Σ_n ‖u_{θ_k}(π_j, τ_j)(x_n^oj, t_n^oj) − u*_j(x_n^oj, t_n^oj)‖²
7:   θ_{k+1} = θ_k − η_p [∇_θ ℓ_k^bc + λ_k^pde ∇_θ ℓ_k^pde + Σ_i λ_k^si ∇_θ ℓ_k^si + Σ_j λ_k^oj ∇_θ ℓ_k^oj]
8:   λ_{k+1}^pde = [λ_k^pde + η_d(ℓ_k^pde − ε_pde)]_+;  λ_{k+1}^si = [λ_k^si + η_d(ℓ_k^si − ε_s)]_+;  λ_{k+1}^oj = [λ_k^oj + η_d(ℓ_k^oj − ε_o)]_+
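The primal and dual updates of Algorithm 1 (lines 7 and 8) can be sketched on a toy constrained problem. Everything below is illustrative: the quadratic objective, the single constraint, and the step sizes eta_p/eta_d stand in for the paper's losses and hyperparameters and are not taken from the SCL codebase.

```python
# Minimal sketch of the primal-dual iteration in Algorithm 1 on a toy problem:
#   minimize (theta - 2)^2  subject to  theta^2 <= eps,  with eps = 1.
# The objective, constraint, step sizes, and iteration count are illustrative
# assumptions, not values from the paper.

def primal_dual(eps=1.0, eta_p=1e-2, eta_d=1e-2, K=5000):
    theta, lam = 0.0, 0.0  # primal variable and dual multiplier (lambda_0 = 0)
    for _ in range(K):
        # Primal step: gradient descent on the Lagrangian
        #   L(theta, lam) = (theta - 2)^2 + lam * (theta^2 - eps)
        grad = 2.0 * (theta - 2.0) + lam * 2.0 * theta
        theta -= eta_p * grad
        # Dual step: projected gradient ascent on the constraint slack,
        # mirroring  lam_{k+1} = [lam_k + eta_d * (loss_k - eps)]_+
        lam = max(0.0, lam + eta_d * (theta**2 - eps))
    return theta, lam

theta, lam = primal_dual()
# The KKT conditions for this toy problem give theta = 1, lam = 1,
# so the iterates should converge to a neighborhood of (1, 1).
```

The dual ascent step only increases a multiplier while its constraint is violated, so constraints that are easy to satisfy end up with small (or zero) multipliers, which is the mechanism the algorithm uses to balance the PDE, symmetry, and observation losses.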
Open Source Code | Yes | Code to reproduce these experiments is available at https://github.com/vmoro1/scl.
Open Datasets | Yes | The datasets from (Li et al., 2021) were used for the Burgers and Navier-Stokes equations, whereas the diffusion-sorption dataset was taken from (Takamoto et al., 2022).
Dataset Splits | Yes | Table 6: Problem hyperparameters for supervised solutions

Problem                    | ε_o     | # training | # validation | # test | FNO architecture
Burgers                    | 10^-3   | 800        | 200          | 200    | 16 modes, 4 layers
Diffusion-sorption         | 10^-3   | 1000       | 500          | 500    | 8 modes, 5 layers
Navier-Stokes (ν = 10^-3)  | 10^-2   | 1000       | 500          | 500    | 8 modes, 8 layers
Navier-Stokes (ν = 10^-4)  | 5·10^-2 | 1000       | 500          | 500    | 8 modes, 8 layers
Navier-Stokes (ν = 10^-5)  | 10^-2   | 800        | 200          | 200    | 8 modes, 8 layers
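As a sketch of what these splits amount to in practice, the diffusion-sorption row (1000 / 500 / 500 samples) corresponds to partitioning a 2000-sample set as below. The data array and its dimensions are synthetic placeholders, not the actual PDEBench data.

```python
import numpy as np

# Illustrative train/val/test split matching the diffusion-sorption row of
# Table 6 (1000 / 500 / 500 samples). The data here is synthetic; the real
# trajectories come from (Takamoto et al., 2022).
rng = np.random.default_rng(0)
data = rng.normal(size=(2000, 101))  # 2000 samples, 101 points each (made up)

train, val, test = np.split(data, [1000, 1500])
print(train.shape, val.shape, test.shape)  # (1000, 101) (500, 101) (500, 101)
```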
Hardware Specification | Yes | It was performed in part on the HoreKa supercomputer, funded by the Ministry of Science, Research and the Arts Baden-Württemberg and by the Federal Ministry of Education and Research.
Software Dependencies | No | All methods were trained using Adam with the default parameters from (Kingma & Ba, 2017) and the learning rates described in Table 4. All models were trained for 500 epochs using Adam with the default settings from (Kingma & Ba, 2017), a learning rate of 10^-3, and a batch size of 20. The paper refers to an optimization algorithm (Adam) and its parameters as described in a cited work, but does not provide specific software library names or version numbers (e.g., PyTorch, TensorFlow, or SciPy versions).
Experiment Setup | Yes | E.1 HYPERPARAMETERS AND IMPLEMENTATION DETAILS. Throughout our experiments, we use the relative L2 error as a performance metric, which we define as

e_rel(π, h) = ( Σ_{n=1}^N ‖u_θ(π, h)(x_n, t_n) − u*(π, h)(x_n, t_n)‖² )^{1/2} / ( Σ_{n=1}^N ‖u*(π, h)(x_n, t_n)‖² )^{1/2},   (32)

where u* is the solution of (BVP) obtained either analytically or by using classical numerical methods. For MLPs, the collocation points {(x_n, t_n)} are taken from a dense regular grid of points (see exact numbers below); for FNOs, they are determined by the test sets from (Li et al., 2021; Takamoto et al., 2022). For parametrized problems, we report the error averaged over problems, (1/M) Σ_{j=1}^M e_rel(π_j, h_j), evaluated either on a dense regular grid of points (for coefficients π; see exact numbers below) or on the test sets from (Li et al., 2021; Takamoto et al., 2022). (Subsections E.1.1–E.1.4, detailing 'Training', contain specific hyperparameters such as learning rates, epochs, batch sizes, the optimizer name (Adam), and other configuration settings.)
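A minimal sketch of the relative L2 error metric, assuming Eq. (32) is the usual norm ratio over collocation points; the function and variable names here are ours, not the paper's:

```python
import numpy as np

def rel_l2_error(u_pred, u_star):
    """Relative L2 error over collocation points, in the spirit of Eq. (32):
    ||u_pred - u_star||_2 / ||u_star||_2, with the norm taken over all points."""
    u_pred = np.asarray(u_pred, dtype=float)
    u_star = np.asarray(u_star, dtype=float)
    return np.linalg.norm(u_pred - u_star) / np.linalg.norm(u_star)

# Example: u_star = [3, 4] has norm 5; a prediction whose error vector is
# [0, 5] (also norm 5) therefore yields a relative error of exactly 1.0.
print(rel_l2_error([3.0, 9.0], [3.0, 4.0]))  # 1.0
```

For parametrized problems, the averaged metric described above is then just the mean of `rel_l2_error` over the M problem instances.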