Towards Multi-spatiotemporal-scale Generalized PDE Modeling
Authors: Jayesh K. Gupta, Johannes Brandstetter
TMLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this work, we make such comprehensive comparisons regarding performance, runtime complexity, memory requirements, and generalization capabilities. Concretely, we stress-test various FNO, (Dilated) ResNet, and U-Net like approaches to fluid mechanics problems in both vorticity-stream and velocity function form. Figure 1: Example rollout trajectories of the best-performing U-Net model... 4 Experiments We establish the following set of desiderata for our benchmarks... Table 1: Comparison of parameter count, runtime, and memory requirement of the tested architectures... Figure 4: One-step errors for modeling different PDEs, shown for different number of training trajectories. |
| Researcher Affiliation | Industry | Jayesh K. Gupta EMAIL Microsoft Autonomous Systems and Robotics Research Johannes Brandstetter EMAIL Microsoft Research AI4Science |
| Pseudocode | No | The paper describes methods and procedures in paragraph text and refers to existing architectures, but does not contain explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Source code for our PyTorch benchmark framework is available at https://github.com/microsoft/pdearena. |
| Open Datasets | No | We modified the implementation in SpeedyWeather.jl (Klöwer et al., 2022), obtaining data on a grid with spatial resolution of 192 × 96 (∆x = 1.875°, ∆y = 3.75°), and temporal resolution of ∆t = 48 h. We obtained data on a grid with spatial resolution of 128 × 128 (∆x = 0.25, ∆y = 0.25), and temporal resolution of ∆t = 1.5 s using ΦFlow (Holl et al., 2020). The paper describes the sources and methods used to *generate* the data (SpeedyWeather.jl, ΦFlow) but does not provide direct access information (e.g., specific links, DOIs, or repository names) for the *datasets themselves* that were used in the experiments. |
| Dataset Splits | Yes | Results are averaged over 208 different unseen evaluation buoyancy force values between 0.2 and 0.5. For training, we used a dataset with higher temporal resolution of ∆t = 0.375 s and get equal number of trajectories from uniformly sampling 832 different external buoyancy force values, f = (0, f)T in Equation 5, in the range 0.2 ≤ f ≤ 0.5, using input fields at one timestep. |
| Hardware Specification | Yes | All experiments used 4 × 16 GB NVIDIA V100 machines for training. We warmup the benchmark for 10 iterations and report average runtimes over 100 runs on a single 16 GB NVIDIA V100 machine with input batch size of 8. |
| Software Dependencies | No | Source code for our PyTorch benchmark framework is available at https://github.com/microsoft/pdearena. We optimized models using the AdamW optimizer (Kingma & Ba, 2014; Loshchilov & Hutter, 2019). We used cosine annealing as learning rate scheduler (Loshchilov & Hutter, 2016). The paper mentions software such as PyTorch, AdamW, and cosine annealing but does not specify their version numbers. |
| Experiment Setup | Yes | We optimized models using the AdamW optimizer (Kingma & Ba, 2014; Loshchilov & Hutter, 2019) for 50 epochs and minimized the summed mean squared error. We used cosine annealing as learning rate scheduler (Loshchilov & Hutter, 2016) with a linear warmup. For FNO models, we optimized number of layers, number of channels, and number of Fourier modes. For U-Net like architectures, especially for U-Netatt, we specifically needed to optimize the maximum learning rate to be lower (10⁻⁴). We used an effective batch size of 32 for training. We used the best learning rates of [10⁻⁴, 2 × 10⁻⁴] and weight decay of 10⁻⁵. |
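The learning-rate schedule quoted above (cosine annealing with a linear warmup, peaking at roughly 10⁻⁴) can be sketched as a standalone function. This is an illustrative reconstruction, not the paper's code; the function name `lr_at` and the warmup/step parameters are assumptions for the sketch, and in practice one would use the equivalent PyTorch scheduler from the authors' pdearena framework.

```python
import math

def lr_at(step, total_steps, warmup_steps, max_lr=1e-4, min_lr=0.0):
    """Linear warmup to max_lr, then cosine annealing down to min_lr.

    Mirrors the schedule described in the experiment setup: warmup
    followed by cosine decay over the remaining training steps.
    """
    if step < warmup_steps:
        # Linear ramp from 0 to max_lr during warmup.
        return max_lr * step / warmup_steps
    # Fraction of the post-warmup training completed, in [0, 1].
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    # Cosine annealing: starts at max_lr (cos 0 = 1), ends at min_lr (cos pi = -1).
    return min_lr + 0.5 * (max_lr - min_lr) * (1.0 + math.cos(math.pi * progress))
```

For example, with 10 warmup steps out of 100 total, the rate is 0 at step 0, reaches the 10⁻⁴ peak at step 10, and decays toward 0 by step 100.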