Pretraining a Neural Operator in Lower Dimensions
Authors: AmirPouya Hemmasian, Amir Barati Farimani
TMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluated the effectiveness of this pretraining strategy in similar PDEs in higher dimensions. We use the Factorized Fourier Neural Operator (FFNO) due to having the necessary flexibility to be applied to PDE data of arbitrary spatial dimensions and reuse trained parameters in lower dimensions. In addition, our work sheds light on the effect of the fine-tuning configuration to make the most of this pretraining strategy. |
| Researcher Affiliation | Academia | Amir Pouya Hemmasian EMAIL Department of Mechanical Engineering Carnegie Mellon University Amir Barati Farimani EMAIL Department of Mechanical Engineering Carnegie Mellon University |
| Pseudocode | Yes | Algorithm 1 PreLowD-ing a neural PDE solver |
| Open Source Code | Yes | Code is available at https://github.com/BaratiLab/PreLowD. |
| Open Datasets | Yes | The 1D datasets are provided by PDEBench (Takamoto et al., 2022), and we generated the 2D datasets using the exact solution function of u(x, y, t) = u0(x − βt, y − βt). |
| Dataset Splits | Yes | For each dimensionality and each value of c, 10000 samples were generated, 2000 of which are held for validation. |
| Hardware Specification | Yes | On our GeForce RTX 2080 Ti NVIDIA GPUs, with a batch size of 64 and 5000 optimization iterations for each stage, the pretraining takes about 3.35 minutes while the fine-tuning or training for the 2D task takes about 21 minutes. |
| Software Dependencies | No | The paper mentions the 'AdamW optimizer' but does not specify version numbers for any software libraries or packages used, such as Python, PyTorch, TensorFlow, etc. This is insufficient to reproduce software dependencies. |
| Experiment Setup | Yes | Each training stage consists of 5000 iterations of the AdamW optimizer with an initial learning rate of 0.001. The learning rate is multiplied by 0.2 when the loss reaches a plateau and does not improve for more than 100 iterations. Our choice of architecture for FFNO has 4 hidden layers with a latent dimension of 128 and 16 Fourier modes in each axis. On our GeForce RTX 2080 Ti NVIDIA GPUs, with a batch size of 64 and 5000 optimization iterations for each stage, the pretraining takes about 3.35 minutes while the fine-tuning or training for the 2D task takes about 21 minutes. |
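The 2D datasets quoted above are generated from the exact translation solution u(x, y, t) = u0(x − βt, y − βt). A minimal NumPy sketch of that generation step, assuming a periodic unit-square domain and an illustrative initial condition (the function names and grid sizes here are placeholders, not the paper's code):

```python
import numpy as np

def advection_exact(u0, x, y, t, beta):
    """Exact 2D advection solution under periodic boundary conditions:
    u(x, y, t) = u0(x - beta*t, y - beta*t), wrapped onto [0, 1)^2."""
    return u0((x - beta * t) % 1.0, (y - beta * t) % 1.0)

# Illustrative periodic initial condition on a 64x64 grid.
u0 = lambda x, y: np.sin(2 * np.pi * x) * np.sin(2 * np.pi * y)
x, y = np.meshgrid(np.linspace(0, 1, 64, endpoint=False),
                   np.linspace(0, 1, 64, endpoint=False), indexing="ij")
sample = advection_exact(u0, x, y, t=0.5, beta=0.2)
```

At t = 0 the solution reduces to the initial condition itself, which gives a quick sanity check on any generated dataset.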
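The quoted experiment setup (AdamW, initial learning rate 0.001, learning rate multiplied by 0.2 after a 100-iteration plateau, 5000 iterations per stage, batch size 64) maps naturally onto PyTorch's `AdamW` and `ReduceLROnPlateau`. A hedged sketch follows; the model and data are placeholders standing in for the paper's FFNO (4 layers, latent dimension 128, 16 Fourier modes) and PDE data:

```python
import torch

# Placeholder model; the paper uses an FFNO, not an MLP.
model = torch.nn.Sequential(torch.nn.Linear(128, 128), torch.nn.GELU(),
                            torch.nn.Linear(128, 128))
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
# Multiply LR by 0.2 when the loss has not improved for 100 iterations.
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode="min", factor=0.2, patience=100)

for step in range(5000):                  # 5000 iterations per training stage
    x = torch.randn(64, 128)              # batch size 64 (synthetic data)
    loss = ((model(x) - x) ** 2).mean()   # stand-in reconstruction loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    scheduler.step(loss.item())           # plateau detection on the loss
```

Whether the paper steps the scheduler on training or validation loss is not stated in the quoted text; training loss is assumed here.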