Active Learning for Neural PDE Solvers

Authors: Daniel Musekamp, Marimuthu Kalimuthu, David Holzmüller, Makoto Takamoto, Mathias Niepert

ICLR 2025

Reproducibility
Variable | Result | LLM Response
Research Type | Experimental | We use the benchmark to evaluate batch active learning algorithms such as uncertainty and feature-based methods. We show that AL reduces the average error by up to 71% compared to random sampling and significantly reduces worst-case errors. Moreover, AL generates similar datasets across repeated runs, with consistent distributions over the PDE parameters and initial conditions. The acquired datasets are reusable, providing benefits for surrogate models not involved in the data generation. [...] Using the benchmark, we conducted several experiments exploring the behavior of AL algorithms for PDE solving. These experiments show that AL can increase data efficiency and especially reduce worst-case errors.
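The uncertainty-based acquisition the quote refers to can be illustrated with a short sketch. This is a minimal, hypothetical implementation of ensemble-disagreement batch selection, not the paper's actual code: `select_batch` and its argument layout are assumptions for illustration.

```python
import numpy as np

def select_batch(ensemble_preds, batch_size):
    """Pick the pool candidates with the highest ensemble disagreement.

    ensemble_preds: array of shape (n_members, n_candidates, ...) holding
    each ensemble member's prediction for every candidate input.
    Returns the indices of the `batch_size` most uncertain candidates.
    """
    # Variance across ensemble members, averaged over all state dimensions.
    variance = ensemble_preds.var(axis=0)
    scores = variance.reshape(variance.shape[0], -1).mean(axis=1)
    # Highest-disagreement candidates first.
    return np.argsort(scores)[::-1][:batch_size]
```

With two ensemble members (as in the paper's setup), the score is simply the per-candidate squared disagreement between the two predictions, averaged over the state.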
Researcher Affiliation | Collaboration | ¹University of Stuttgart, ²SimTech, ³IMPRS-IS, ⁴INRIA Paris, École Normale Supérieure, PSL University, ⁵NEC Labs Europe
Pseudocode | Yes | Listing 1 shows the pseudocode of the (random) data generation pipeline. [...] Listing 2 shows the interface for the Model and ProbModel classes.
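The Model/ProbModel split mentioned above can be sketched as a pair of abstract interfaces: a base surrogate that trains and predicts, and a probabilistic variant that additionally exposes an uncertainty estimate for the acquisition function. The method names and signatures below are assumptions for illustration, not the paper's Listing 2.

```python
from abc import ABC, abstractmethod

class Model(ABC):
    """Surrogate PDE solver interface (hypothetical sketch)."""

    @abstractmethod
    def train(self, trajectories):
        """Fit the surrogate on a set of solver-generated trajectories."""

    @abstractmethod
    def predict(self, initial_conditions, pde_params):
        """Roll out the surrogate from the given initial conditions."""

class ProbModel(Model):
    """A surrogate that also quantifies its predictive uncertainty."""

    @abstractmethod
    def uncertainty(self, initial_conditions, pde_params):
        """Score candidate inputs for the acquisition step."""
```

Separating `uncertainty` into a subclass keeps feature-based and random acquisition usable with plain `Model` implementations.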
Open Source Code Yes The code is available at https://github.com/dmusekamp/al4pde.
Open Datasets | Yes | The ground truth data is generated using a numerical solver, which can be defined as a forward operator G : U × ℝˡ → U [...] The inputs to the initial value problem are drawn from the test input distribution p_T [...] We use the FDM-based JAX simulator and the initial condition generator from PDEBench (Takamoto et al., 2022).
Dataset Splits | Yes | The test set consists of 2048 trajectories simulated with random inputs drawn from p_T(ψ). [...] The initial training data consists of 64 trajectories, whereas the validation and test sets each have 512 trajectories.
Hardware Specification | Yes | The experiments were performed on NVIDIA GeForce RTX 4090 GPUs (one per experiment), except for the 3D CNS case, which was performed on a single 96 GB H100 GPU.
Software Dependencies | No | The paper mentions a 'JAX simulator' for Burgers and CNS and references models such as U-Net, FNO, and SineNet. However, it does not provide version numbers for these components or for other key dependencies, such as the machine learning framework (e.g., PyTorch, TensorFlow) or the Python version, which are necessary for full reproducibility.
Experiment Setup | Yes | The training is performed for 500 epochs with a cosine schedule, which reduces the learning rate from 10⁻³ to 10⁻⁵. The batch size is set to 512 (2D CNS: 64). We use an exponential data schedule, i.e., in each AL iteration, the amount of data added is equal to the current training set size (Kirsch et al., 2023). For 1D equations, we start with 256 trajectories. The pool size is fixed to 100,000 candidates (3D: 30,000). The uncertainty is estimated using two ensemble members [...]. Activation: GELU (Hendrycks & Gimpel, 2016); conditioning: Fourier (Vaswani et al., 2017); channel multiplier: [1, 2, 2, 4]; hidden channels: 16.
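The two schedules quoted above are easy to make concrete. Below is a minimal sketch of (a) a cosine learning-rate schedule annealing from 10⁻³ to 10⁻⁵ over 500 epochs and (b) the exponential data schedule, where each AL iteration adds as many trajectories as the training set currently contains, doubling it. Function names are illustrative, not taken from the paper's code.

```python
import math

def cosine_lr(epoch, total_epochs=500, lr_max=1e-3, lr_min=1e-5):
    """Cosine annealing from lr_max at epoch 0 down to lr_min at the end."""
    t = epoch / total_epochs
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * t))

def exponential_data_schedule(initial_size, n_iterations):
    """Training-set sizes under the exponential schedule: each AL iteration
    adds the current set size, so the dataset doubles every iteration."""
    sizes = [initial_size]
    for _ in range(n_iterations):
        sizes.append(2 * sizes[-1])
    return sizes
```

Starting from the 256 trajectories quoted for 1D equations, three AL iterations yield 256 → 512 → 1024 → 2048 trajectories.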