UPS: Efficiently Building Foundation Models for PDE Solving via Cross-Modal Adaptation
Authors: Junhong Shen, Tanya Marwah, Ameet Talwalkar
TMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | UPS achieves state-of-the-art results on a wide range of 1D and 2D PDE families from PDEBench, outperforming existing unified models using 4 times less data and 26 times less compute. Meanwhile, it is capable of few-shot transfer to unseen PDE families and coefficients. |
| Researcher Affiliation | Academia | Junhong Shen, Tanya Marwah, and Ameet Talwalkar: Machine Learning Department, Carnegie Mellon University. |
| Pseudocode | No | The paper describes the methodology using text and diagrams (Figure 1) and mathematical equations, but it does not include a formal pseudocode block or algorithm steps formatted like code. |
| Open Source Code | Yes | Code is available at https://github.com/sjunhongshen/UnifiedPDESolvers. |
| Open Datasets | Yes | We train and evaluate our method using PDEBench (Takamoto et al., 2022). For training, we combine 7 datasets from different PDE families: Burgers Equation (1D), Advection (1D), Diffusion-Sorption (1D), Shallow-Water (2D), compressible Navier-Stokes (1D and 2D), and incompressible Navier-Stokes (2D). |
| Dataset Splits | Yes | We first study the in-distribution performance of UPS, i.e., we evaluate UPS on the test splits of the datasets that are used to train UPS... The set of coefficients and number of trajectories used per PDE are reported in Appendix Table 5. For full details on the data generation process and the hyperparameters used to generate the PDE dataset, we refer the reader to Takamoto et al. (2022). Table 5: For each PDE family, we select one set of coefficients and use the data for training and testing UPS. (Columns: Dimension, Dataset, Coefficients, Num Train Trajectories, Num Test Trajectories, Timesteps, Resolution.) |
| Hardware Specification | Yes | All of our experiments can be run on a single NVIDIA A6000 GPU. |
| Software Dependencies | No | The paper mentions software components like RoBERTa, T5, CLIP, and the Adam optimizer, but does not provide specific version numbers for these or other relevant libraries/frameworks (e.g., PyTorch, TensorFlow). |
| Experiment Setup | Yes | A.2.1 Training Hyperparameters. We use the following training hyperparameters for all of our experiments, unless otherwise specified. Batch size: 32, Gradient accumulation: 1, Gradient clipping: -1 (disabled), Optimizer: Adam, Learning rate: 5E-5, Weight decay: 1E-5, Training epochs: 20 for stage 1, 100 for stage 2 |
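The reported hyperparameters can be collected into a minimal training configuration. The sketch below assumes a PyTorch implementation (the paper does not state the framework version), and the `torch.nn.Linear` model is a placeholder standing in for the actual UPS network:

```python
import torch

# Training hyperparameters reported in Appendix A.2.1 of the paper.
BATCH_SIZE = 32
GRAD_ACCUMULATION = 1
LEARNING_RATE = 5e-5
WEIGHT_DECAY = 1e-5
EPOCHS_STAGE_1 = 20   # alignment stage
EPOCHS_STAGE_2 = 100  # fine-tuning stage

# Placeholder model; UPS itself adapts a pretrained LLM backbone.
model = torch.nn.Linear(16, 16)

optimizer = torch.optim.Adam(
    model.parameters(),
    lr=LEARNING_RATE,
    weight_decay=WEIGHT_DECAY,
)

# Gradient clipping is reported as -1 (disabled), and gradient accumulation
# is 1, so each batch of 32 trajectories yields exactly one optimizer step.
```

This only fixes the optimizer and schedule constants; the data pipeline and model architecture would come from the released repository.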