U-NO: U-shaped Neural Operators
Authors: Md Ashiqur Rahman, Zachary E Ross, Kamyar Azizzadenesheli
TMLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We establish our empirical study on Darcy’s flow and the Navier-Stokes equations, two PDEs which have served as benchmarks for the study of neural operator models. We compare the performance of U-NO against the state-of-the-art FNO. We empirically show that the advanced structure of U-NO allows for much deeper neural operator models with smaller memory usage. We demonstrate that U-NO achieves average performance improvements of 26% on high-resolution simulations of Darcy’s flow equation, and 44% on the Navier-Stokes equations, with the best improvement of 51%. |
| Researcher Affiliation | Collaboration | Md Ashiqur Rahman, Department of Computer Science, Purdue University; Zachary E. Ross, Seismological Laboratory, California Institute of Technology; Kamyar Azizzadenesheli, NVIDIA Corporation |
| Pseudocode | No | The paper describes the U-NO architecture and its components in detail using prose and mathematical formulations, but it does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain any explicit statements about releasing source code or provide any links to code repositories. |
| Open Datasets | No | The paper describes the process of generating data for Darcy's flow and Navier-Stokes equations (e.g., 'For dataset preparation, we define µ as a pushforward of the Gaussian measure...', 'The diffusion coefficients a(x) are generated according to a µ and we fix f(x) = 1.'), but it does not provide any concrete access information (such as a link, DOI, or citation to a public repository) for these datasets. |
| Dataset Splits | No | From the dataset of N = 2000 simulations of the Darcy flow equation, we set aside 250 simulations for testing and use the rest for training and validation. For each experiment, 10% of the total number of simulations N are set aside for testing, and the rest is used for training and validation. The split between training and validation within the remaining data is not specified. |
| Hardware Specification | Yes | All the computations are carried on a single Nvidia GPU with 24GB memory. |
| Software Dependencies | No | The paper mentions using 'Adam optimizer (Kingma & Ba, 2014)' and 'GELU (Hendrycks & Gimpel, 2016) activation function' but does not provide specific version numbers for these or any other software libraries or frameworks (e.g., Python, PyTorch, TensorFlow). |
| Experiment Setup | Yes | For the experiments, we use Adam optimizer (Kingma & Ba, 2014) and the initial learning rate is scaled down by a factor of 0.5 every 100 epochs. As the non-linearity, we have used the GELU (Hendrycks & Gimpel, 2016) activation function. We train for 700 epochs and save the best performing model on the validation set for evaluation. |
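The reported data split and learning-rate schedule can be sketched in plain Python. This is a hypothetical illustration, not the authors' code: the paper gives the test split (250 of 2000 simulations for Darcy flow, or 10% of N) and the step decay (halve the learning rate every 100 epochs over 700 epochs), but does not specify the training/validation split, so the `val_fraction` parameter below is an assumption.

```python
def split_counts(n_total: int, n_test: int, val_fraction: float = 0.1):
    """Split simulation counts into (train, val, test).

    val_fraction is an assumption: the paper only says the remainder
    after the test split is used for training and validation.
    """
    remainder = n_total - n_test
    n_val = int(remainder * val_fraction)
    n_train = remainder - n_val
    return n_train, n_val, n_test


def lr_at_epoch(initial_lr: float, epoch: int,
                step: int = 100, factor: float = 0.5) -> float:
    """Step decay as described: scale the learning rate by `factor`
    every `step` epochs (the paper uses factor 0.5 every 100 epochs)."""
    return initial_lr * factor ** (epoch // step)


if __name__ == "__main__":
    # Darcy flow: N = 2000 simulations, 250 held out for testing.
    print(split_counts(2000, 250))
    # Learning rate at the final (700th) epoch of training.
    print(lr_at_epoch(1e-3, 699))
```

With a 10% validation split this yields 1575/175/250 train/val/test simulations, and the learning rate at the final epoch is the initial rate scaled by 0.5^6. The initial learning-rate value itself is not stated in the quoted setup, so `1e-3` above is only a placeholder.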