Ensemble and Mixture-of-Experts DeepONets For Operator Learning
Authors: Ramansh Sharma, Varun Shankar
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We then demonstrate that ensemble DeepONets containing a trunk ensemble of a standard trunk, the PoU-MoE trunk, and/or a proper orthogonal decomposition (POD) trunk can achieve 2-4x lower relative ℓ2 errors than standard DeepONets and POD-DeepONets on both standard and challenging new operator learning problems involving partial differential equations (PDEs) in two and three dimensions. |
| Researcher Affiliation | Academia | Ramansh Sharma Kahlert School of Computing, University of Utah, UT, USA EMAIL Varun Shankar Kahlert School of Computing, University of Utah, UT, USA EMAIL |
| Pseudocode | No | The paper describes the methodologies and models using mathematical formulations and textual descriptions, but does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not explicitly state that the code is open-source or provide a link to a code repository. It mentions plans for future work regarding parallelization strategy, but not for current code release. |
| Open Datasets | Yes | We focused on the steady state problem and used the dataset specified in Lu et al. (2022, Section 5.7, Case A). |
| Dataset Splits | Yes | We solved the PDE numerically at Ny = 2207 collocation points using a fourth-order accurate RBF-FD method (Shankar & Fogelson, 2018; Shankar et al., 2021); using this solver, we generated 1000 training and 200 test input and output function pairs. |
| Hardware Specification | Yes | We trained all models for 150,000 epochs on an NVIDIA GTX 4080 GPU. |
| Software Dependencies | No | The paper mentions the use of Adam and AdamW optimizers (Kingma & Ba, 2017; Loshchilov & Hutter, 2018) but does not provide specific version numbers for any software libraries or frameworks used (e.g., Python, PyTorch, TensorFlow). |
| Experiment Setup | Yes | We trained all models for 150,000 epochs on an NVIDIA GTX 4080 GPU. All results were calculated over five random seeds. We annealed the learning rates with an inverse-time decay schedule. We used the Adam optimizer (Kingma & Ba, 2017) for training on the Darcy flow and the cavity flow problems, and the AdamW optimizer (Loshchilov & Hutter, 2018) on the 2D and 3D reaction-diffusion problems. Other DeepONet hyperparameters and the network architectures are listed in Appendix E. Appendix E contains Tables 5 and 6 detailing network architectures and 'p' values. |
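The Research Type row quotes 2-4x lower relative ℓ2 errors. The paper does not reproduce the exact evaluation code, but a minimal sketch of the standard relative ℓ2 metric (per-sample norm ratio, averaged over the test batch; the function name and batch layout here are illustrative assumptions) looks like:

```python
import numpy as np

def relative_l2_error(pred, true):
    """Mean over samples of ||pred - true||_2 / ||true||_2.

    Assumes the first axis indexes samples and the last axis the
    output-function values (an illustrative convention, not the
    paper's exact code).
    """
    pred = np.asarray(pred, dtype=float)
    true = np.asarray(true, dtype=float)
    num = np.linalg.norm(pred - true, axis=-1)  # per-sample error norm
    den = np.linalg.norm(true, axis=-1)         # per-sample reference norm
    return float(np.mean(num / den))

# Example: a prediction uniformly 10% high has 10% relative error.
true = np.ones((5, 100))
pred = 1.1 * true
print(relative_l2_error(pred, true))  # → 0.1 (up to float rounding)
```

Under this metric, "2-4x lower" means the ensemble's averaged ratio is a half to a quarter of the baseline's on the same test pairs.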
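The Experiment Setup row reports annealing with an inverse-time decay schedule but not its hyperparameters. A common form of that schedule, sketched below with placeholder values (`lr0`, `decay_steps`, `decay_rate` are assumptions for illustration, not values from the paper), is:

```python
def inverse_time_decay(lr0, step, decay_steps, decay_rate):
    """Standard inverse-time decay: lr0 / (1 + decay_rate * step / decay_steps).

    The learning rate starts at lr0 and decays hyperbolically with the
    training step. All hyperparameter values below are placeholders,
    not taken from the paper.
    """
    return lr0 / (1.0 + decay_rate * step / decay_steps)

lr0 = 1e-3  # placeholder initial learning rate
print(inverse_time_decay(lr0, 0, 1000, 0.5))       # lr0 at step 0
print(inverse_time_decay(lr0, 150000, 1000, 0.5))  # small lr by the final epoch
```

Frameworks ship equivalents (e.g. TensorFlow's `tf.keras.optimizers.schedules.InverseTimeDecay`), so the schedule itself is reproducible even without the paper's code; only the constants would need to be recovered from Appendix E.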