reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

On the Benefits of Active Data Collection in Operator Learning

Authors: Unique Subedi, Ambuj Tewari

ICML 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	In this section, we conduct numerical studies comparing our active data collection strategy with passive data collection (random sampling) for learning solution operators for the Poisson and Heat Equations. Figures (1) and (2) show the testing error as a function of the training sample size.
Researcher Affiliation	Academia	1Department of Statistics, University of Michigan, Ann Arbor, USA. Correspondence to: Unique Subedi <EMAIL>.
Pseudocode	No	The paper describes the data collection strategy and estimator in Section 3.1 and Appendix A.1 but does not present it in a structured pseudocode or algorithm block.
Open Source Code	Yes	Our code is available at https://github.com/unique-subedi/active-operator-learning.
Open Datasets	No	For the passive data collection strategy, the input functions f are independently sampled as f ∼ GP(0, 50^2(∇^2+I)^-2), where GP denotes Gaussian Process... For each initial condition, we use the finite difference method with forward-time discretization to compute the solution u_1 at t = 1. This is not a publicly available dataset with a link or citation, but a description of how synthetic data is generated.
Dataset Splits	Yes	For testing, 100 additional source functions f ∼ GP(0, 50^2(∇^2 + I)^-2) are generated... All estimators are evaluated on a test set of size 100, drawn from the same distribution as the training data.
Hardware Specification	No	The paper does not explicitly describe the hardware used to run its experiments, mentioning only the grid size for computations (e.g., '64x64 grid') but no specific processor or GPU models.
Software Dependencies	No	The paper does not provide specific software dependencies with version numbers. It mentions using 'Fourier Neural Operator (FNO)' and 'finite-difference method' but without version details.
Experiment Setup	Yes	The FNO model has four Fourier layers and N/2 Fourier modes, where N denotes the number of grid points along each spatial dimension. In our experiments, all computations are carried out on a 64x64 grid, so N = 64... This is done using 1000 time discretization steps on a 64x64 grid. For our experiments, we set τ = 10^-2.
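
The synthetic data generation quoted in the Open Datasets and Experiment Setup rows can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation (their code is in the repository linked in the table). Assumptions not confirmed by the quoted text: the domain is the unit square with periodic boundary conditions, the covariance operator is read with the positive-definite sign convention 50^2(-∇^2 + I)^-2, τ is the diffusivity in the heat equation u_t = τ∇^2 u, and the function names `sample_grf` and `solve_heat` are hypothetical.

```python
import numpy as np


def sample_grf(n=64, scale=50.0, rng=None):
    """Draw one sample from GP(0, scale^2 (-∇^2 + I)^-2) on an n x n grid.

    Assumes a periodic unit square, so Fourier modes diagonalise the covariance.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Integer wavenumbers; the mode exp(2*pi*i*k.x) has -∇^2 eigenvalue 4*pi^2*|k|^2.
    k = np.fft.fftfreq(n, d=1.0 / n)
    kx, ky = np.meshgrid(k, k, indexing="ij")
    # Square root of the covariance eigenvalues: scale / (4*pi^2*|k|^2 + 1).
    sqrt_eig = scale / (4.0 * np.pi**2 * (kx**2 + ky**2) + 1.0)
    # Colour white noise in Fourier space. The factor n is one normalisation
    # convention (assumed here) that matches the continuum covariance kernel.
    white = rng.standard_normal((n, n))
    return n * np.fft.ifft2(sqrt_eig * np.fft.fft2(white)).real


def solve_heat(u0, tau=1e-2, t_final=1.0, n_steps=1000):
    """Forward-time, centred-space finite differences for u_t = tau * ∇^2 u.

    Periodic boundaries (an assumption) are handled with np.roll.
    """
    n = u0.shape[0]
    dx = 1.0 / n
    dt = t_final / n_steps
    u = u0.astype(float).copy()
    for _ in range(n_steps):
        # Five-point discrete Laplacian.
        lap = (
            np.roll(u, 1, axis=0) + np.roll(u, -1, axis=0)
            + np.roll(u, 1, axis=1) + np.roll(u, -1, axis=1)
            - 4.0 * u
        ) / dx**2
        u = u + dt * tau * lap
    return u
```

With the quoted settings (64x64 grid, 1000 time steps, τ = 10^-2), the explicit scheme has τ·dt/dx^2 ≈ 0.041, well inside the usual 2D stability limit of 0.25. A passive training pair under these assumptions would be `u0 = sample_grf(64)` followed by `u1 = solve_heat(u0)`.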