Learning Operators with Coupled Attention

Authors: Georgios Kissas, Jacob H. Seidman, Leonardo Ferreira Guilhoto, Victor M. Preciado, George J. Pappas, Paris Perdikaris

JMLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Empirically, we evaluate the performance of LOCA on several operator learning scenarios involving systems governed by ordinary and partial differential equations, as well as a black-box climate prediction problem. Through these scenarios we demonstrate state of the art accuracy, robustness with respect to noisy input data, and a consistently small spread of errors over testing data sets, even for out-of-distribution prediction tasks."
Researcher Affiliation | Academia | "Georgios Kissas EMAIL, Department of Mechanical Engineering and Applied Mechanics, University of Pennsylvania, Philadelphia, PA 19104"
Pseudocode | Yes | "Algorithm 1 provides an overview of the steps required for implementing the LOCA method."
Open Source Code | Yes | "All code and data accompanying this manuscript will be made publicly available at https://github.com/PredictiveIntelligenceLab/LOCA."
Open Datasets | Yes | Mechanical MNIST database (Lejeune, 2020); Physical Sciences Laboratory meteorological data (Kalnay et al., 1996).
Dataset Splits | Yes | "Out of the 70,000 realizations that the MNIST data set contains, 60,000 are used for training and 10,000 are used for testing, therefore Ntrain = 60,000 and Ntest = 10,000." For climate modeling: Ntrain = 1825 (excluding the days for leap years). "We consider a test data set consisting of the daily surface air temperature and pressure data from the years 2005 to 2010, meaning Ntest = 1825 (excluding leap years), on a 72 × 72 grid also."
Hardware Specification | Yes | "All the models are trained on a single NVIDIA RTX A6000 GPU."
Software Dependencies | No | "We also thank the developers of the software that enabled our research, including JAX (Bradbury et al., 2018), Kymatio (Andreux et al., 2020), Matplotlib (Hunter, 2007), Pytorch (Paszke et al., 2019) and NumPy (Harris et al., 2020)."
Experiment Setup | Yes | "For LOCA and DON we set the batch size to be 100, initial learning rate equal to lr = 0.001, and an exponential learning rate decay with a decay-rate of 0.95 every 100 training iterations. For the FNO training, we set the batch size to be 100 and consider a learning rate lr = 0.001, which we then reduce by 0.5 every 100 epochs and a weight decay of 0.0001. Moreover, for the FNO method we use the ReLU activation function. All networks are trained via mini-batch stochastic gradient descent using the Adam optimizer with default settings (Kingma and Ba, 2014)."
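The two learning-rate schedules quoted above (exponential staircase decay of 0.95 every 100 iterations for LOCA/DON, and halving every 100 epochs for FNO) can be sketched in plain Python. This is a minimal illustration of the stated hyperparameters, not the authors' code; the function names are ours.

```python
def loca_don_lr(step, lr0=1e-3, decay_rate=0.95, every=100):
    """LOCA/DON schedule: exponential decay, multiplying the
    initial rate by 0.95 once every 100 training iterations."""
    return lr0 * decay_rate ** (step // every)

def fno_lr(epoch, lr0=1e-3, factor=0.5, every=100):
    """FNO schedule: step decay, halving the rate every 100 epochs."""
    return lr0 * factor ** (epoch // every)

# At step 0 both schedules return the shared initial rate of 0.001;
# after 100 iterations/epochs the LOCA/DON rate is 0.001 * 0.95 and
# the FNO rate is 0.001 * 0.5.
print(loca_don_lr(0), loca_don_lr(100), fno_lr(100))
```

In practice such schedules would be passed to the Adam optimizer (e.g. via a schedule object in JAX/optax or a `StepLR`-style scheduler in PyTorch, both of which appear in the paper's dependency list); the arithmetic above is the whole content of the quoted setup.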