Mechanistic PDE Networks for Discovery of Governing Equations

Authors: Adeel Pervez, Efstratios Gavves, Francesco Locatello

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate PDE discovery on a number of PDEs in two and three dimensions (including time), including complex dynamical data from the 2D Navier-Stokes and reaction-diffusion PDEs. We examine the robustness of the method to noise. We also consider PDE parameter discovery in the case where the PDE cannot be expressed as a linear combination of fixed basis functions. Table 1 lists the PDEs that we use to demonstrate our method, including their general form.
Researcher Affiliation | Academia | 1Institute of Science and Technology, Klosterneuburg, Austria 2Informatics Institute, University of Amsterdam, Amsterdam, The Netherlands. Correspondence to: Adeel Pervez <EMAIL>.
Pseudocode | Yes | Algorithm 1 V-cycle
Open Source Code | No | Source code will be made available at: https://github.com/alpz/mech-nn-discovery-pde
Open Datasets | No | The data was generated by a spectral method following the implementation in de Silva et al. (2020), setting D1 = D2 = 0.1 and β = 1. The data was generated by using the Basilisk solver (Kenneally et al., 2020) starting from a random initial condition and running with a maximum time step size of 0.05.
Dataset Splits | No | For our model, the training data is mini-batched with a mini-batch size of 8. Each data example in the mini-batch is of size 32×32×32, where the first dimension is time.
Hardware Specification | No | The resulting iterative solver is entirely sparse and batch-parallel on GPU for a batch of PDEs and is useful for higher dimensional PDEs where GPU memory cannot hold the full dense constraint matrices.
Software Dependencies | No | We compare against a reference finite-difference implementation from the py-pde package with sinusoidal boundary conditions, obtaining L2 errors of orders 10⁻¹, 5×10⁻², and 10⁻², respectively, showing the errors decreasing with increasing resolution.
Experiment Setup | Yes | For our model, the training data is mini-batched with a mini-batch size of 8. Each data example in the mini-batch is of size 32×32×32, where the first dimension is time. The data is parameterized by 10-layer 2D ResNets, where we consider the time dimension as a batch dimension. The final loss is an L1 loss, which we find improves discovery. The final term in the loss is a sparsity term, also L1, for the basis polynomial parameters. We use a weight of 10⁻⁴ for the sparsity term. The model is trained with Adam with a learning rate of 10⁻⁵, which we fix for all experiments.
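The objective quoted in the Experiment Setup row (an L1 data loss plus an L1 sparsity penalty with weight 10⁻⁴ on the basis-polynomial coefficients) can be sketched as follows. This is a minimal illustration with hypothetical names and shapes, not the authors' implementation:

```python
import numpy as np

SPARSITY_WEIGHT = 1e-4  # sparsity weight reported in the paper

def discovery_loss(prediction, target, basis_coeffs):
    """L1 reconstruction loss plus L1 sparsity on the PDE basis coefficients."""
    data_loss = np.mean(np.abs(prediction - target))
    sparsity = SPARSITY_WEIGHT * np.sum(np.abs(basis_coeffs))
    return data_loss + sparsity

# Toy example: a mini-batch of 8 examples, each of size 32x32x32 (time first).
rng = np.random.default_rng(0)
pred = rng.standard_normal((8, 32, 32, 32))
target = rng.standard_normal((8, 32, 32, 32))
coeffs = rng.standard_normal(10)  # hypothetical basis-coefficient vector

loss = discovery_loss(pred, target, coeffs)
```

In training, the sparsity term drives most basis coefficients toward zero so that only a few candidate terms survive to form the discovered PDE; the paper optimizes such a loss with Adam at a fixed learning rate of 10⁻⁵.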
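The pseudocode row above names "Algorithm 1 V-cycle". The paper's version operates on batched sparse constraint systems; as a generic illustration of the technique only, here is a classic multigrid V-cycle for the 1D Poisson problem -u'' = f with zero boundary conditions (all names and grid choices are assumptions for the sketch, not the authors' algorithm):

```python
import numpy as np

def smooth(u, f, h, iters=3, w=2/3):
    # Weighted Jacobi relaxation for -u'' = f; mutates u in place.
    for _ in range(iters):
        jac = 0.5 * (u[:-2] + u[2:] + h * h * f[1:-1])
        u[1:-1] = (1 - w) * u[1:-1] + w * jac
    return u

def residual(u, f, h):
    # r = f - A u with the standard second-order stencil; zero on the boundary.
    r = np.zeros_like(u)
    r[1:-1] = f[1:-1] - (2 * u[1:-1] - u[:-2] - u[2:]) / (h * h)
    return r

def restrict(r):
    # Full-weighting restriction onto a grid with half the interior points.
    interior = 0.25 * (r[1:-2:2] + 2 * r[2:-1:2] + r[3::2])
    return np.concatenate(([0.0], interior, [0.0]))

def prolong(ec):
    # Linear interpolation of the coarse-grid correction to the fine grid.
    e = np.zeros(2 * (len(ec) - 1) + 1)
    e[::2] = ec
    e[1::2] = 0.5 * (ec[:-1] + ec[1:])
    return e

def v_cycle(u, f, h):
    # Pre-smooth, recurse on the residual equation, correct, post-smooth.
    u = smooth(u, f, h)
    if len(u) <= 3:
        return u
    ec = v_cycle(np.zeros((len(u) + 1) // 2), restrict(residual(u, f, h)), 2 * h)
    u = u + prolong(ec)
    return smooth(u, f, h)

# Solve -u'' = pi^2 sin(pi x) on [0, 1]; exact solution u = sin(pi x).
n = 65
x = np.linspace(0.0, 1.0, n)
h = x[1] - x[0]
f = np.pi ** 2 * np.sin(np.pi * x)
u = np.zeros(n)
for _ in range(20):
    u = v_cycle(u, f, h)
err = np.max(np.abs(u - np.sin(np.pi * x)))
```

Each V-cycle damps high-frequency error by smoothing on the fine grid and low-frequency error by solving a restricted residual equation on coarser grids, which is why a handful of cycles suffices where plain Jacobi would need thousands of iterations.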