UPS: Efficiently Building Foundation Models for PDE Solving via Cross-Modal Adaptation

Authors: Junhong Shen, Tanya Marwah, Ameet Talwalkar

TMLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | UPS achieves state-of-the-art results on a wide range of 1D and 2D PDE families from PDEBench, outperforming existing unified models while using 4 times less data and 26 times less compute. It is also capable of few-shot transfer to unseen PDE families and coefficients.
Researcher Affiliation | Academia | Junhong Shen (Machine Learning Department, Carnegie Mellon University); Tanya Marwah (Machine Learning Department, Carnegie Mellon University); Ameet Talwalkar (Machine Learning Department, Carnegie Mellon University)
Pseudocode | No | The paper describes the methodology using text, diagrams (Figure 1), and mathematical equations, but it does not include a formal pseudocode block or algorithm listing.
Open Source Code | Yes | Code is available at https://github.com/sjunhongshen/UnifiedPDESolvers.
Open Datasets | Yes | We train and evaluate our method using PDEBench (Takamoto et al., 2022). For training, we combine 7 datasets from different PDE families: Burgers Equation (1D), Advection (1D), Diffusion-Sorption (1D), Shallow-Water (2D), compressible Navier-Stokes (1D and 2D), and incompressible Navier-Stokes (2D).
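The multi-family training setup described above can be sketched as pooling several per-family datasets into one. The snippet below is a minimal illustration using random tensors as stand-ins for the 7 PDEBench families; the shapes, trajectory counts, and `make_dummy_family` helper are assumptions for illustration, not the paper's actual data-loading code (real usage would read the PDEBench HDF5 trajectory files).

```python
import torch
from torch.utils.data import ConcatDataset, TensorDataset

def make_dummy_family(num_traj: int, timesteps: int, resolution: int) -> TensorDataset:
    # Stand-in for one PDE family: (trajectories, timesteps, spatial resolution).
    # Real trajectories would be loaded from PDEBench instead of sampled randomly.
    return TensorDataset(torch.randn(num_traj, timesteps, resolution))

# Seven stand-in families, mirroring the 7 PDEBench datasets combined for training.
families = [make_dummy_family(num_traj=10, timesteps=41, resolution=64) for _ in range(7)]

# A single unified model is trained on the union of all families.
combined = ConcatDataset(families)
print(len(combined))  # 70: sum of trajectories across the 7 families
```

`ConcatDataset` simply chains the per-family datasets, so a standard `DataLoader` over `combined` draws samples from all PDE families in one training loop.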
Dataset Splits | Yes | We first study the in-distribution performance of UPS, i.e., we evaluate UPS on the test splits of the datasets that are used to train UPS... The set of coefficients and number of trajectories used per PDE are reported in Appendix Table 5. For full details on the data generation process and the hyperparameters used to generate the PDE dataset, we refer the reader to Takamoto et al. (2022). Table 5 lists, for each PDE family, the selected set of coefficients, the number of training and test trajectories, the timesteps, and the resolution.
Hardware Specification | Yes | All of our experiments can be run on a single NVIDIA A6000 GPU.
Software Dependencies | No | The paper mentions software components such as RoBERTa, T5, CLIP, and the Adam optimizer, but does not provide specific version numbers for these or for other relevant libraries/frameworks (e.g., PyTorch, TensorFlow).
Experiment Setup | Yes | A.2.1 Training Hyperparameters. We use the following training hyperparameters for all of our experiments, unless otherwise specified: batch size 32, gradient accumulation 1, gradient clipping -1, optimizer Adam, learning rate 5e-5, weight decay 1e-5, training epochs 20 for stage 1 and 100 for stage 2.
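The reported hyperparameters map directly onto a standard PyTorch training configuration. The sketch below encodes them as constants plus an Adam optimizer; the `torch.nn.Linear` model is a placeholder (UPS itself adapts a pretrained LLM backbone), and the reading of gradient clipping "-1" as "disabled" is an assumption based on common convention, not stated in the report.

```python
import torch

# Placeholder model; stands in for the actual UPS architecture.
model = torch.nn.Linear(16, 16)

# Adam with the learning rate and weight decay reported in Appendix A.2.1.
optimizer = torch.optim.Adam(
    model.parameters(),
    lr=5e-5,
    weight_decay=1e-5,
)

BATCH_SIZE = 32
GRAD_ACCUM_STEPS = 1
GRAD_CLIP = -1        # assumed to mean gradient clipping is disabled
EPOCHS_STAGE1 = 20    # first training stage
EPOCHS_STAGE2 = 100   # second training stage
```

In a training loop, `GRAD_CLIP = -1` would typically translate to skipping the `torch.nn.utils.clip_grad_norm_` call, and gradient accumulation of 1 means the optimizer steps after every batch.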