Optimization for Neural Operators can Benefit from Width
Authors: Pedro Cisneros-Velarde, Bhavesh Shrimali, Arindam Banerjee
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We present empirical results on canonical operator learning problems to support our theoretical results and find that larger widths benefit training. ... Finally, to complement our theoretical results, we present empirical evaluations of DONs and FNOs and show the benefits of width on learning three popular operators in the literature (Li et al., 2021a; Lu et al., 2021): antiderivative, diffusion-reaction, and Burgers equation. Our experiments show that increasing the width leads to lower training losses and generally leads to faster convergence. |
| Researcher Affiliation | Collaboration | ¹VMware Research, ²University of Illinois Urbana-Champaign. |
| Pseudocode | No | The paper describes the architectures of DONs and FNOs in text and through mathematical equations and schematics, but does not include any explicit pseudocode blocks or algorithms. |
| Open Source Code | Yes | The associated code for the experiments in Section 8 and the ones presented in this appendix are found in https://github.com/bhaveshshrimali/neuralop_optimization. |
| Open Datasets | Yes | We make use of the datasets publicly available at https://github.com/neuraloperator/neuraloperator, specifically the Burgers R10.mat dataset available at https://drive.google.com/drive/folders/1UnbQh2WWc6knEHbLn-ZaXrKUZhp7pjt-. |
| Dataset Splits | No | The paper mentions generating training data for the Antiderivative and Diffusion-Reaction operators, and a total sample size for the Burgers Equation dataset (2048 input functions). However, it does not provide specific train/test/validation splits (e.g., percentages or counts for each set) or reference standard predefined splits for reproducibility. For example, for Antiderivative: "sample size of the training data is n = 2000", and for Burgers: "comprises of 2048 input functions... We test the trained neural operators on a simple GRF sampled from the training dataset" which doesn't specify the split. |
| Hardware Specification | Yes | We remark that all experiments with widths m ∈ {10, 50} were run on a personal computer with one NVIDIA Quadro GPU, while the rest of the widths were run on Google Colab with single NVIDIA L4 and A100 GPUs. |
| Software Dependencies | No | The paper mentions using the Adam optimizer and Scaled Exponential Linear Unit (SELU) as activation functions but does not specify version numbers for any software libraries or frameworks (e.g., Python, PyTorch, TensorFlow, or specific Adam/SELU implementations). |
| Experiment Setup | Yes | We monitor the training process over 80,000 training epochs... We fix the learning rate for the Adam optimizer at 10⁻³ and use full-batch training, i.e., a batch size of 2000 for both DONs and FNOs. ... For all the experiments we use a constant learning rate of 3e-4 and the Adam optimizer with a batch size of 4000. ... For all the experiments we use a constant learning rate of 10⁻³ and the Adam optimizer with a batch size of 800. |
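The setup above fixes the Adam optimizer at a constant learning rate (10⁻³) with full-batch training. As a minimal, self-contained sketch of that optimizer configuration (this is not the authors' code; the single-parameter quadratic objective and the step count are illustrative assumptions), a plain Adam loop under the reported learning rate looks like:

```python
import math

def adam_minimize(grad, w0, lr=1e-3, beta1=0.9, beta2=0.999,
                  eps=1e-8, steps=20_000):
    """Plain Adam on one scalar parameter; mirrors a full-batch
    setting where every step sees the whole dataset's gradient."""
    w, m, v = w0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(w)
        m = beta1 * m + (1 - beta1) * g       # first-moment (mean) estimate
        v = beta2 * v + (1 - beta2) * g * g   # second-moment estimate
        m_hat = m / (1 - beta1 ** t)          # bias correction
        v_hat = v / (1 - beta2 ** t)
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w

# Illustrative objective f(w) = (w - 3)^2, so grad f(w) = 2(w - 3);
# with lr = 1e-3 the iterate settles near the minimizer w = 3.
w_star = adam_minimize(lambda w: 2.0 * (w - 3.0), w0=0.0)
```

The paper's actual experiments train DONs and FNOs on operator-learning datasets with the same hyperparameter pattern (constant learning rate, Adam, fixed batch size); only the objective and parameter count differ.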