Ensemble and Mixture-of-Experts DeepONets for Operator Learning

Authors: Ramansh Sharma, Varun Shankar

TMLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We then demonstrate that ensemble DeepONets containing a trunk ensemble of a standard trunk, the PoU-MoE trunk, and/or a proper orthogonal decomposition (POD) trunk can achieve 2-4x lower relative ℓ2 errors than standard DeepONets and POD-DeepONets on both standard and challenging new operator learning problems involving partial differential equations (PDEs) in two and three dimensions.
Researcher Affiliation | Academia | Ramansh Sharma, Kahlert School of Computing, University of Utah, UT, USA; Varun Shankar, Kahlert School of Computing, University of Utah, UT, USA
Pseudocode | No | The paper describes the methodologies and models using mathematical formulations and textual descriptions, but does not include any clearly labeled pseudocode or algorithm blocks.
Open Source Code | No | The paper does not explicitly state that the code is open source or provide a link to a code repository. It mentions plans for future work on the parallelization strategy, but not a current code release.
Open Datasets | Yes | We focused on the steady state problem and used the dataset specified in Lu et al. (2022, Section 5.7, Case A).
Dataset Splits | Yes | We solved the PDE numerically at Ny = 2207 collocation points using a fourth-order accurate RBF-FD method (Shankar & Fogelson, 2018; Shankar et al., 2021); using this solver, we generated 1000 training and 200 test input and output function pairs.
Hardware Specification | Yes | We trained all models for 150,000 epochs on an NVIDIA GTX 4080 GPU.
Software Dependencies | No | The paper mentions the use of the Adam and AdamW optimizers (Kingma & Ba, 2017; Loshchilov & Hutter, 2018) but does not provide specific version numbers for any software libraries or frameworks used (e.g., Python, PyTorch, TensorFlow).
Experiment Setup | Yes | We trained all models for 150,000 epochs on an NVIDIA GTX 4080 GPU. All results were calculated over five random seeds. We annealed the learning rates with an inverse-time decay schedule. We used the Adam optimizer (Kingma & Ba, 2017) for training on the Darcy flow and the cavity flow problems, and the AdamW optimizer (Loshchilov & Hutter, 2018) on the 2D and 3D reaction-diffusion problems. Other DeepONet hyperparameters and the network architectures are listed in Appendix E, which contains Tables 5 and 6 detailing network architectures and 'p' values.
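The 2-4x improvements quoted in the Research Type row are measured in relative ℓ2 error, the standard operator-learning metric. A minimal sketch of that metric for reference; the function name and test values below are illustrative, not taken from the paper:

```python
import numpy as np

def relative_l2_error(u_pred, u_true):
    """Relative l2 error: ||u_pred - u_true||_2 / ||u_true||_2.

    Computed over the flattened solution field; evaluation details
    (per-sample averaging, grid weighting) are assumptions here.
    """
    u_pred = np.asarray(u_pred, dtype=float)
    u_true = np.asarray(u_true, dtype=float)
    return np.linalg.norm(u_pred - u_true) / np.linalg.norm(u_true)

# A prediction uniformly off by 1% has a relative l2 error of 0.01.
u_true = np.ones(100)
u_pred = 1.01 * u_true
err = relative_l2_error(u_pred, u_true)  # ≈ 0.01
```

Under this metric, "2-4x lower" means the ensemble model's error is one half to one quarter of the baseline's on the same test pairs.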
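The Experiment Setup row quotes an inverse-time decay schedule for the learning rates. A minimal sketch of such a schedule, assuming the common form lr(step) = lr0 / (1 + decay_rate * step / decay_steps); the constants below are illustrative, since the paper's exact values are not quoted in this report:

```python
def inverse_time_decay(lr0, decay_rate, decay_steps):
    """Return a function mapping a training step to its learning rate.

    lr(step) = lr0 / (1 + decay_rate * step / decay_steps)

    lr0, decay_rate, and decay_steps are hypothetical values for
    illustration, not the paper's reported hyperparameters.
    """
    def lr(step):
        return lr0 / (1.0 + decay_rate * step / decay_steps)
    return lr

schedule = inverse_time_decay(lr0=1e-3, decay_rate=0.5, decay_steps=10_000)
lr_start = schedule(0)        # lr0 at step 0
lr_end = schedule(150_000)    # lr0 / (1 + 0.5 * 15) = lr0 / 8.5
```

Such a closed-form schedule plugs directly into, e.g., PyTorch's `torch.optim.lr_scheduler.LambdaLR` as a multiplicative factor.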