U-NO: U-shaped Neural Operators

Authors: Md Ashiqur Rahman, Zachary E Ross, Kamyar Azizzadenesheli

TMLR 2023

Reproducibility Variable Result LLM Response
Research Type: Experimental. We establish our empirical study on Darcy's flow and the Navier-Stokes equations, two PDEs which have served as benchmarks for the study of neural operator models. We compare the performance of U-NO against the state-of-the-art FNO. We empirically show that the advanced structure of U-NO allows for much deeper neural operator models with smaller memory usage. We demonstrate that U-NO achieves average performance improvements of 26% on high-resolution simulations of Darcy's flow equation and 44% on the Navier-Stokes equation, with the best improvement of 51%.
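The reported improvements are relative reductions in test error against the FNO baseline. A minimal sketch of that arithmetic, using hypothetical error values that are not from the paper:

```python
def relative_improvement(err_baseline: float, err_model: float) -> float:
    """Percent reduction in error of a model relative to a baseline."""
    return 100.0 * (err_baseline - err_model) / err_baseline

# Hypothetical errors for illustration only (not figures from the paper):
print(relative_improvement(0.0100, 0.0049))  # roughly a 51% improvement
```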
Researcher Affiliation: Collaboration. Md Ashiqur Rahman (EMAIL), Department of Computer Science, Purdue University; Zachary E. Ross (EMAIL), Seismological Laboratory, California Institute of Technology; Kamyar Azizzadenesheli (EMAIL), NVIDIA Corporation.
Pseudocode: No. The paper describes the U-NO architecture and its components in detail using prose and mathematical formulations, but it does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code: No. The paper does not contain any explicit statements about releasing source code, nor does it provide links to code repositories.
Open Datasets: No. The paper describes the process of generating data for Darcy's flow and the Navier-Stokes equations (e.g., 'For dataset preparation, we define µ as a pushforward of the Gaussian measure...', 'The diffusion coefficients a(x) are generated according to a ∼ µ and we fix f(x) = 1.'), but it does not provide any concrete access information (such as a link, DOI, or citation to a public repository) for these datasets.
Dataset Splits: No. 'From the dataset of N = 2000 simulations of the Darcy flow equation, we set aside 250 simulations for testing and use the rest for training and validation.' 'For each experiment, 10% of the total number of simulations N are set aside for testing, and the rest is used for training and validation.' The split between training and validation within the remainder is not specified.
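The quoted protocol fixes only the test split. A minimal sketch of the partition, assuming a hypothetical 10% validation fraction for the remainder, which the paper leaves unspecified:

```python
import numpy as np

def split_indices(n_total: int, n_test: int, val_frac: float = 0.1, seed: int = 0):
    """Partition simulation indices into train/validation/test sets.

    n_test follows the paper (e.g. 250 of 2000 for Darcy flow, or 10% of N);
    val_frac is a hypothetical choice -- the paper does not specify it.
    """
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_total)
    test = idx[:n_test]
    rest = idx[n_test:]
    n_val = int(len(rest) * val_frac)
    return rest[n_val:], rest[:n_val], test  # train, val, test

train, val, test = split_indices(2000, 250)
print(len(train), len(val), len(test))  # 1575 175 250
```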
Hardware Specification: Yes. All the computations are carried out on a single Nvidia GPU with 24GB memory.
Software Dependencies: No. The paper mentions using the 'Adam optimizer (Kingma & Ba, 2014)' and the 'GELU (Hendrycks & Gimpel, 2016) activation function' but does not provide specific version numbers for these or any other software libraries or frameworks (e.g., Python, PyTorch, TensorFlow).
Experiment Setup: Yes. For the experiments, we use the Adam optimizer (Kingma & Ba, 2014) and the initial learning rate is scaled down by a factor of 0.5 every 100 epochs. As the non-linearity, we have used the GELU (Hendrycks & Gimpel, 2016) activation function. We train for 700 epochs and save the best performing model on the validation set for evaluation.
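The quoted schedule and model-selection rule can be sketched directly; the initial learning rate below is a hypothetical value, since the paper does not state it:

```python
def lr_at_epoch(epoch: int, initial_lr: float = 1e-3) -> float:
    """Quoted schedule: 'the initial learning rate is scaled down by a
    factor of 0.5 every 100 epochs'. initial_lr is an assumed value."""
    return initial_lr * 0.5 ** (epoch // 100)

def select_best(val_losses):
    """Quoted protocol: 'save the best performing model on the validation
    set' -- return the epoch with the lowest validation loss."""
    return min(range(len(val_losses)), key=val_losses.__getitem__)

# Over the 700 training epochs, the rate is halved six times
# (at epochs 100, 200, ..., 600):
print(lr_at_epoch(0), lr_at_epoch(699))
```

In a PyTorch training loop, the same schedule corresponds to `torch.optim.lr_scheduler.StepLR(optimizer, step_size=100, gamma=0.5)` attached to an Adam optimizer.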