Physics-informed Kernel Learning
Authors: Nathan Doumèche, Francis Bach, Gérard Biau, Claire Boyer
JMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the numerical performance of the PIKL estimator through simulations, both in the context of hybrid modeling and in solving PDEs. In particular, we show that PIKL can outperform physics-informed neural networks in terms of both accuracy and computation time. |
| Researcher Affiliation | Collaboration | Nathan Doumèche EMAIL Sorbonne University, EDF R&D; Francis Bach EMAIL INRIA, École Normale Supérieure, PSL University; Gérard Biau EMAIL Sorbonne University, Institut universitaire de France; Claire Boyer EMAIL Université Paris-Saclay, Institut universitaire de France |
| Pseudocode | No | The paper describes the construction of the PIKL estimator and its algorithm using mathematical formulations and descriptions (e.g., Section 2: The PIKL Estimator, Section 3: The PIKL Algorithm in Practice), but it does not include a distinct, structured pseudocode or algorithm block. |
| Open Source Code | Yes | To enhance the reproducibility of our work, we provide a Python package that implements the PIKL estimator, designed to handle any linear PDE prior with constant coefficients in dimensions d = 1 and d = 2. This package is available at https://github.com/NathanDoumeche/numerical_PIML_kernel. Note that this package implements the matrix inversion of the PIKL formula (6) by solving a linear system using the LU decomposition. Of course, any other efficient method to avoid direct matrix inversion could be used instead, such as solving a linear system with the conjugate gradient method. |
| Open Datasets | No | The paper states: "To compare the PIKL and OLS estimators, we generate data such that Y = f(X) + ε, where X ~ U(Ω), ε ~ N(0, σ²)" and later mentions: "The training data set (X_i, Y_i), 1 ≤ i ≤ n, is constructed such that...". This indicates that the authors generated their own synthetic datasets for the experiments and did not use or provide access to any pre-existing public datasets. |
| Dataset Splits | No | The paper describes how training data points are generated and assigned to different boundary conditions (e.g., "The training data set (X_i, Y_i), 1 ≤ i ≤ n, is constructed such that if 1 ≤ i ≤ ⌊n/4⌋, then X_i = (0, U_i) and Y_i = sin(πU_i) + sin(4πU_i)/2; if ⌊n/4⌋ + 1 ≤ i ≤ 2⌊n/4⌋, then X_i = (U_i, 0) and Y_i = 0..."). While it mentions evaluation on a "test set," it does not explicitly provide percentages or counts for training, validation, and test splits from a larger dataset, nor does it refer to standard predefined splits. |
| Hardware Specification | Yes | The training time for Vanilla PINNs is 7 minutes on an Nvidia L4 GPU (24 GB of RAM, 30.3 tera FLOPs for Float32). ... The training time for the PIKL estimator is 6 seconds on an Nvidia L4 GPU. |
| Software Dependencies | No | The paper mentions a "Python package" for implementation and discusses "Float32" and "Float64" precision. However, it does not provide specific version numbers for Python, any libraries, or other software dependencies. |
| Experiment Setup | Yes | To compare the PIKL and OLS estimators, we generate data such that Y = f(X) + ε, where X ~ U(Ω), ε ~ N(0, σ²) with σ = 0.5, and the target function is f = f1 (corresponding to (a1, a2) = (1, 0)). We implement the PIKL algorithm with 601 Fourier modes (m = 300) and s = 2. For the 1d-wave equation, it states: "We train our PIKL method using n = 10^5 data points and 1681 Fourier modes (i.e., m = 20)." For PINNs, it mentions: "The PINN is a fully-connected neural network with three hidden layers of size 10, using tanh as activation function, and optimized on 2 × 10^5 collocation points by 2000 gradient descent steps." |
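The Open Source Code entry notes that the authors' package solves the linear system of the PIKL formula (6) via LU decomposition rather than explicit matrix inversion, and that an iterative method such as conjugate gradient could be used instead. The following is a minimal sketch of that trade-off on a stand-in symmetric positive definite system (the matrix here is synthetic, not the actual PIKL kernel matrix):

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve
from scipy.sparse.linalg import cg

# Stand-in for a regularized kernel system A @ theta = b; the real PIKL
# matrix from formula (6) is not reproduced here. A is SPD by construction.
rng = np.random.default_rng(0)
M = rng.standard_normal((50, 50))
A = M @ M.T + 50.0 * np.eye(50)
b = rng.standard_normal(50)

# Direct approach: factor once with LU, then solve (as the package does).
lu, piv = lu_factor(A)
theta_lu = lu_solve((lu, piv), b)

# Iterative alternative: conjugate gradient, which avoids factorization
# entirely and suits large, well-conditioned SPD systems.
theta_cg, info = cg(A, b)

assert info == 0                                  # CG converged
assert np.allclose(theta_lu, theta_cg, atol=1e-4)  # both solve the same system
```

LU is a natural default for moderate system sizes since the factorization can be reused across right-hand sides; CG trades that for lower memory cost on large systems.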
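The Experiment Setup entry quotes the data-generating process Y = f(X) + ε with X ~ U(Ω) and ε ~ N(0, σ²), σ = 0.5. A minimal sketch of that process follows; note that the domain Ω and the target f1 (specified in the paper only through the coefficients (a1, a2) = (1, 0)) are replaced here by placeholder choices:

```python
import numpy as np

def generate_data(f, n, sigma=0.5, seed=0):
    """Draw (X_i, Y_i) with X ~ U(Omega) and Y = f(X) + N(0, sigma^2) noise.

    Omega is assumed to be [-1, 1]^2 for illustration; the paper's exact
    domain and target f1 are not reproduced here.
    """
    rng = np.random.default_rng(seed)
    X = rng.uniform(-1.0, 1.0, size=(n, 2))   # X ~ U(Omega), Omega assumed [-1, 1]^2
    eps = rng.normal(0.0, sigma, size=n)      # ε ~ N(0, σ²), σ = 0.5 as in the paper
    Y = f(X) + eps
    return X, Y

# Usage with a hypothetical smooth target in place of f1
X, Y = generate_data(lambda x: np.sin(np.pi * x[:, 0]), n=1000)
```

Fixing the seed makes the synthetic dataset reproducible, which matters when comparing estimators (here, PIKL vs. OLS) on identical draws.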