OmniArch: Building Foundation Model for Scientific Computing
Authors: Tianyu Chen, Haoyi Zhou, Ying Li, Hao Wang, Chonghan Gao, Rongye Shi, Shanghang Zhang, Jianxin Li
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We present OmniArch, the first prototype aimed at solving multi-scale and multi-physics scientific computing problems with physical alignment. We address all three challenges with one unified architecture: its pre-training stage contains a Fourier encoder-decoder that smooths out the disharmony across separate dimensions and a Transformer backbone that integrates quantities through temporal dynamics, while the novel PDEAligner performs physics-informed fine-tuning under flexible conditions. To our knowledge, we are the first to conduct 1D-2D-3D unified pre-training on PDEBench; the model not only sets new performance benchmarks for 1D, 2D, and 3D PDEs but also demonstrates exceptional adaptability to new physics via in-context and zero-shot learning, supporting realistic engineering applications and forward-looking physics discovery. |
| Researcher Affiliation | Academia | 1SKLCCSE, School of Computer Science and Engineering, Beihang University, Beijing, China 2SKLMIP, School of Computer Science, Peking University, Beijing, China 3School of Artificial Intelligence, Beihang University, Beijing, China. |
| Pseudocode | No | The paper describes the architecture and methodology in detail using prose and mathematical equations but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | We release our model's base and large variants¹, concurrently addressing 1D, 2D, and 3D PDEs. ¹https://openi.pcl.ac.cn/cty315/OmniArch |
| Open Datasets | Yes | We collect 1D, 2D, and 3D datasets from the public PDEBench and PDEArena. ... PDEBench (Takamoto et al., 2022), PDEArena (Gupta & Brandstetter, 2022) |
| Dataset Splits | Yes | We structured the PDEBench data into distinct training, validation, and testing subsets. For one-dimensional (1D) PDEs, the training dataset comprises a selection from the CFD-1D, ReacDiff, Advection, Burgers, and diff-sorp datasets. From these, we reserve a random 10% sample of trajectories as the in-domain test set for each respective PDE equation. ... In the two-dimensional (2D) PDE case, we allocate 90% of trajectories from the CFD, diff-react, NSincom, and shallow-water datasets for training. The remaining 10% form the in-domain test set. ... For three-dimensional (3D) PDEs, 90% of trajectories from the CFD-3D dataset are utilized for training, with the remaining 10% serving as the in-domain test set. |
| Hardware Specification | Yes | Fine-tuning is performed on an A40 GPU cluster, which has 40GiB of memory per device. |
| Software Dependencies | No | The paper mentions software components like the "LLaMA model", "BERT model", and "albert-math model" but does not specify their version numbers. |
| Experiment Setup | Yes | In our training process, the following strategies or decisions were made: Pre/Post Norm: Pre-norm, Norm Type: RMSNorm, Architecture: Decoder-Only, Attention Type: Multi-scaled Attention, Position Embedding: RoPE, Causal Masking: True. We only evaluate the loss on the T + 1 physical fields prediction. Hidden Size: 1024, initializer_range: 0.02, intermediate_size: 4096, num_attention_heads: 16. ... Table 9: Detailed setting of hyperparameters in pre-training the base and large models. ... Table 10: Detailed fine-tuning settings: the table provides the learning rate, width, modes, and batch size for 1D, 2D, and 3D data. |
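The dataset-split procedure quoted above (a random 10% of trajectories held out as the in-domain test set, the remaining 90% used for training) can be sketched as follows. This is a minimal illustration of the described protocol, not the authors' code; the function name, seed, and use of trajectory indices are assumptions.

```python
import random

def split_trajectories(trajectory_ids, test_fraction=0.10, seed=0):
    """Reserve a random `test_fraction` of trajectories as the in-domain
    test set, as described for PDEBench; the rest form the training set.
    Name, seed, and signature are illustrative, not from the paper."""
    ids = list(trajectory_ids)
    random.Random(seed).shuffle(ids)  # deterministic shuffle for reproducibility
    n_test = max(1, int(len(ids) * test_fraction))
    return ids[n_test:], ids[:n_test]  # (train, test)

# Toy usage: 1000 hypothetical trajectories -> 900 train / 100 test.
train, test = split_trajectories(range(1000))
```

The same helper would apply unchanged to the 1D, 2D, and 3D cases, since the paper describes an identical 90/10 trajectory-level split for each.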
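The experiment-setup row lists the pre-training hyperparameters and names RMSNorm as the normalization layer. The sketch below collects those reported values into a config dict and implements standard RMSNorm (normalize by the root-mean-square over the hidden axis, then scale), assuming the conventional LLaMA-style formulation; the dict keys and function are illustrative, not the authors' implementation.

```python
import numpy as np

# Hyperparameters as reported in the table (cf. Table 9 of the paper);
# the dict itself is an illustrative assumption, not the authors' config file.
PRETRAIN_CONFIG = {
    "hidden_size": 1024,
    "intermediate_size": 4096,
    "num_attention_heads": 16,
    "initializer_range": 0.02,
    "norm_type": "RMSNorm",
    "position_embedding": "RoPE",
    "causal_masking": True,
}

def rms_norm(x, gain, eps=1e-6):
    """Standard RMSNorm: divide by the root-mean-square of the last axis,
    then apply a learned per-feature gain."""
    rms = np.sqrt(np.mean(x ** 2, axis=-1, keepdims=True) + eps)
    return x / rms * gain

# Toy usage: normalize a batch of hidden states of the reported width.
hidden = np.random.default_rng(0).normal(size=(2, PRETRAIN_CONFIG["hidden_size"]))
out = rms_norm(hidden, gain=np.ones(PRETRAIN_CONFIG["hidden_size"]))
```

With a unit gain, the normalized output has a per-row mean square of roughly 1, which is the invariant RMSNorm enforces ahead of each pre-norm Transformer sublayer.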