Separable Operator Networks

Authors: Xinling Yu, Sean Hooten, Ziyue Liu, Yequan Zhao, Marco Fiorentino, Thomas Van Vaerenbergh, Zheng Zhang

TMLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | This section presents comprehensive numerical studies demonstrating the expressive power and effectiveness of SepONet compared to PI-DeepONet on various time-dependent PDEs: diffusion-reaction, advection, Burgers, and (2+1)-dimensional nonlinear diffusion equations. Both models were trained by optimizing the physics loss (equation (4)) on a dataset D consisting of input functions, residual points, initial points, and boundary points. PDE definitions are summarized in Table 2. We evaluate both models by varying the number of input functions (Nf) and training points (Nc) across four key perspectives: test accuracy, GPU memory usage, training time, and extreme-scale learning capabilities. The main results are illustrated in Figure 2 and Figure 3, with complete test results reported in Appendix B.3.
Researcher Affiliation | Collaboration | Xinling Yu EMAIL University of California, Santa Barbara; Sean Hooten EMAIL Hewlett Packard Labs, Hewlett Packard Enterprise
Pseudocode | No | The paper describes the architecture of SepONet in Figure 1 and provides mathematical formulations, but it does not include any clearly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | Open source code is available at https://github.com/HewlettPackard/separable-operator-networks.
Open Datasets | No | The input training source terms are sampled from a mean-zero Gaussian random field (GRF) (Seeger, 2004) with a length scale of 0.2. To generate the test dataset, we sample 100 different source terms from the same GRF and apply a second-order implicit finite difference method (Iserles, 2009) to obtain the reference solutions on a uniform 128 × 128 grid.
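The data-generation recipe quoted above (mean-zero GRF source terms with length scale 0.2, sampled on a 128-point axis) can be sketched with a Cholesky-based sampler. This is a minimal illustration, not the authors' code: the RBF kernel choice and the function name `sample_grf` are assumptions, and the paper's exact GRF construction may differ.

```python
import numpy as np

def sample_grf(x: np.ndarray, length_scale: float = 0.2,
               n_samples: int = 1, seed: int = 0) -> np.ndarray:
    """Draw mean-zero Gaussian random field samples on the grid `x`.

    Assumes an RBF (squared-exponential) covariance; the kernel used in
    the paper is not specified in the quoted text.
    """
    rng = np.random.default_rng(seed)
    diff = x[:, None] - x[None, :]
    cov = np.exp(-0.5 * (diff / length_scale) ** 2)
    # Small diagonal jitter keeps the Cholesky factorization stable.
    chol = np.linalg.cholesky(cov + 1e-8 * np.eye(len(x)))
    return (chol @ rng.standard_normal((len(x), n_samples))).T

# Example: 100 test source terms on a uniform 128-point grid, as in the report.
x = np.linspace(0.0, 1.0, 128)
fields = sample_grf(x, n_samples=100)
print(fields.shape)  # (100, 128)
```

Each row of `fields` is one sampled source term; stacking them gives the 100-function test set described above (the reference solutions would still require a separate finite-difference solve).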
Dataset Splits | Yes | We set the number of residual points to Nr = N^d = Nc, where d is the problem dimension and N is an integer. Here, Nc refers to the total number of training points. ... The number of initial and boundary points per axis is set to NI = Nb = N = Nc^(1/d), and these points are also randomly sampled from the solution domain. ... We evaluate both models by varying the number of input functions (Nf) and training points (Nc).
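The collocation-point bookkeeping quoted above follows from Nr = N^d = Nc: in d dimensions, a separable grid of Nc residual points uses N = Nc^(1/d) points per axis. A minimal sketch (the helper name `points_per_axis` is ours, not from the paper):

```python
import math

def points_per_axis(n_c: int, d: int) -> int:
    """Return N such that N**d == Nc (Nc must be a perfect d-th power)."""
    n = round(n_c ** (1.0 / d))
    if n ** d != n_c:
        raise ValueError("Nc must be a perfect d-th power")
    return n

# Example: Nc = 128**2 residual points in a d = 2 problem -> 128 per axis,
# matching the 128 x 128 evaluation grid mentioned in the report.
print(points_per_axis(128 ** 2, 2))  # -> 128
```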
Hardware Specification | Yes | The code in this study is implemented using JAX and Equinox libraries (Bradbury et al., 2018; Kidger & Garcia, 2021), and all training was performed on a single NVIDIA A100 GPU with 80 GB of memory.
Software Dependencies | No | The code in this study is implemented using JAX and Equinox libraries (Bradbury et al., 2018; Kidger & Garcia, 2021). No specific version numbers for JAX or Equinox are provided.
Experiment Setup | Yes | Both PI-DeepONet and SepONet were trained by minimizing the physics loss (equation (4)) using gradient descent with the Adam optimizer (Kingma & Ba, 2014). The initial learning rate is 1 × 10⁻³ and decays by a factor of 0.9 every 1,000 iterations. Additionally, we resample input training functions and training points (including residual, initial, and boundary points) every 100 iterations. ... Training hyperparameters are provided in Table 5.
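The quoted schedule (initial rate 1 × 10⁻³, decayed by a factor of 0.9 every 1,000 iterations, with data resampling every 100 iterations) amounts to a staircase exponential decay. A minimal sketch in plain Python; the helper names are hypothetical and this is not the authors' implementation:

```python
def learning_rate(step: int, init_lr: float = 1e-3,
                  decay: float = 0.9, every: int = 1000) -> float:
    # Staircase decay: multiply by `decay` once per `every` completed steps.
    return init_lr * decay ** (step // every)

def should_resample(step: int, period: int = 100) -> bool:
    # Resample input functions and training points every `period` iterations.
    return step % period == 0

print(learning_rate(0))     # initial rate, 1e-3
print(learning_rate(2500))  # after two decays: 1e-3 * 0.9**2, about 8.1e-4
print(should_resample(300)) # True
```

In the paper's JAX setting this schedule would typically be passed to the optimizer as a step-dependent learning-rate function rather than computed by hand.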