Active Learning for Neural PDE Solvers

Authors: Daniel Musekamp, Marimuthu Kalimuthu, David Holzmüller, Makoto Takamoto, Mathias Niepert

ICLR 2025

Reproducibility
Variable | Result | LLM Response
Research Type | Experimental | We use the benchmark to evaluate batch active learning algorithms such as uncertainty and feature-based methods. We show that AL reduces the average error by up to 71% compared to random sampling and significantly reduces worst-case errors. Moreover, AL generates similar datasets across repeated runs, with consistent distributions over the PDE parameters and initial conditions. The acquired datasets are reusable, providing benefits for surrogate models not involved in the data generation. [...] Using the benchmark, we conducted several experiments exploring the behavior of AL algorithms for PDE solving. These experiments show that AL can increase data efficiency and especially reduce worst-case errors.
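The uncertainty-based acquisition the quote refers to can be illustrated with a short sketch. This is a minimal, hypothetical implementation of ensemble-disagreement batch selection, not the paper's actual code: `select_batch` and its argument layout are assumptions for illustration.

```python
import numpy as np

def select_batch(ensemble_preds, batch_size):
    """Pick the pool candidates with the highest ensemble disagreement.

    ensemble_preds: array of shape (n_members, n_candidates, ...) holding
    each ensemble member's prediction for every candidate input.
    Returns the indices of the `batch_size` most uncertain candidates.
    """
    # Variance across ensemble members, averaged over all state dimensions.
    variance = ensemble_preds.var(axis=0)
    scores = variance.reshape(variance.shape[0], -1).mean(axis=1)
    # Highest-disagreement candidates first.
    return np.argsort(scores)[::-1][:batch_size]
```

With two ensemble members (as in the paper's setup), the score is simply the per-candidate squared disagreement between the two predictions, averaged over the state.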
Researcher Affiliation | Collaboration | ¹University of Stuttgart, ²SimTech, ³IMPRS-IS, ⁴INRIA Paris, École Normale Supérieure, PSL University, ⁵NEC Labs Europe
Pseudocode | Yes | Listing 1 shows the pseudocode of the (random) data generation pipeline. [...] Listing 2 shows the interface for the Model and ProbModel classes.
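The Model/ProbModel split mentioned above can be sketched as a pair of abstract interfaces: a base surrogate that trains and predicts, and a probabilistic variant that additionally exposes an uncertainty estimate for the acquisition function. The method names and signatures below are assumptions for illustration, not the paper's Listing 2.

```python
from abc import ABC, abstractmethod

class Model(ABC):
    """Surrogate PDE solver interface (hypothetical sketch)."""

    @abstractmethod
    def train(self, trajectories):
        """Fit the surrogate on a set of solver-generated trajectories."""

    @abstractmethod
    def predict(self, initial_conditions, pde_params):
        """Roll out the surrogate from the given initial conditions."""

class ProbModel(Model):
    """A surrogate that also quantifies its predictive uncertainty."""

    @abstractmethod
    def uncertainty(self, initial_conditions, pde_params):
        """Score candidate inputs for the acquisition step."""
```

Separating `uncertainty` into a subclass keeps feature-based and random acquisition usable with plain `Model` implementations.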
Open Source Code Yes The code is available at https://github.com/dmusekamp/al4pde.
Open Datasets | Yes | The ground truth data is generated using a numerical solver, which can be defined as a forward operator G : U × ℝˡ → U [...] The inputs to the initial value problem are drawn from the test input distribution p_T [...] We use the FDM-based JAX simulator and the initial condition generator from PDEBench (Takamoto et al., 2022).
Dataset Splits | Yes | The test set consists of 2048 trajectories simulated with random inputs drawn from p_T(ψ). [...] The initial training data consists of 64 trajectories, whereas the validation and test sets each have 512 trajectories.
Hardware Specification | Yes | The experiments were performed on NVIDIA GeForce RTX 4090 GPUs (one per experiment), except for the 3D CNS case, which was performed on a single 96 GB H100 GPU.
Software Dependencies | No | The paper mentions a 'JAX simulator' for Burgers and CNS and references models such as U-Net, FNO, and SineNet. However, it does not provide version numbers for these components or for other key dependencies, such as the machine learning framework (e.g., PyTorch, TensorFlow) or the Python version, which are necessary for full reproducibility.
Experiment Setup | Yes | The training is performed for 500 epochs with a cosine schedule, which reduces the learning rate from 10⁻³ to 10⁻⁵. The batch size is set to 512 (2D CNS: 64). We use an exponential data schedule, i.e., in each AL iteration, the amount of data added is equal to the current training set size (Kirsch et al., 2023). For 1D equations, we start with 256 trajectories. The pool size is fixed to 100,000 candidates (3D: 30,000). The uncertainty is estimated using two ensemble members [...]. Activation: GELU (Hendrycks & Gimpel, 2016); conditioning: Fourier (Vaswani et al., 2017); channel multiplier: [1, 2, 2, 4]; hidden channels: 16.
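The two schedules quoted above are easy to make concrete. Below is a minimal sketch of (a) a cosine learning-rate schedule annealing from 10⁻³ to 10⁻⁵ over 500 epochs and (b) the exponential data schedule, where each AL iteration adds as many trajectories as the training set currently contains, doubling it. Function names are illustrative, not taken from the paper's code.

```python
import math

def cosine_lr(epoch, total_epochs=500, lr_max=1e-3, lr_min=1e-5):
    """Cosine annealing from lr_max at epoch 0 down to lr_min at the end."""
    t = epoch / total_epochs
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * t))

def exponential_data_schedule(initial_size, n_iterations):
    """Training-set sizes under the exponential schedule: each AL iteration
    adds the current set size, so the dataset doubles every iteration."""
    sizes = [initial_size]
    for _ in range(n_iterations):
        sizes.append(2 * sizes[-1])
    return sizes
```

Starting from the 256 trajectories quoted for 1D equations, three AL iterations yield 256 → 512 → 1024 → 2048 trajectories.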