Tensor-Var: Efficient Four-Dimensional Variational Data Assimilation

Authors: Yiming Yang, Xiaoyuan Cheng, Daniel Giles, Sibo Cheng, Yi He, Xiao Xue, Boli Chen, Yukun Hu

ICML 2025

Reproducibility assessment: each entry below lists the Variable, the Result, and the supporting LLM Response.
Variable: Research Type
Result: Experimental
LLM Response: "To evaluate our proposed method, the comparison is conducted on a series of benchmark domains, representing the optimization problem (2) of increasing complexity, including (1) the Lorenz-96 system (Lorenz, 1996) with ns = 40 and 80; (2) the Kuramoto-Sivashinsky (KS) equation... Experiments on chaotic systems and global weather prediction with real-time observations show that Tensor-Var outperforms conventional and DL hybrid 4D-Var baselines in accuracy while achieving a 10- to 20-fold speed improvement."
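The Lorenz-96 system cited above is a standard chaotic benchmark for data assimilation. A minimal sketch of its dynamics with ns = 40, using the conventional forcing F = 8 and an RK4 integrator (the paper's exact integration settings are not stated in the excerpt, so the step size here is an assumption):

```python
import numpy as np

def lorenz96_rhs(x, forcing=8.0):
    """Right-hand side of the Lorenz-96 ODE:
    dx_i/dt = (x_{i+1} - x_{i-2}) * x_{i-1} - x_i + F (cyclic indices)."""
    return (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x + forcing

def step_rk4(x, dt=0.05, forcing=8.0):
    """Advance the state one step with classical fourth-order Runge-Kutta."""
    k1 = lorenz96_rhs(x, forcing)
    k2 = lorenz96_rhs(x + 0.5 * dt * k1, forcing)
    k3 = lorenz96_rhs(x + 0.5 * dt * k2, forcing)
    k4 = lorenz96_rhs(x + dt * k3, forcing)
    return x + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

ns = 40                      # state dimension, as in the paper
x = 8.0 * np.ones(ns)
x[0] += 0.01                 # small perturbation to trigger chaotic behavior
for _ in range(1000):
    x = step_rk4(x)
```

Trajectories of this system stay bounded but diverge rapidly from nearby initial conditions, which is what makes it a useful assimilation testbed.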
Variable: Researcher Affiliation
Result: Collaboration
LLM Response: "1 University College London, London, United Kingdom; 2 CEREA, ENPC and EDF R&D, Institut Polytechnique de Paris, Paris, France. Correspondence to: Yiming Yang <EMAIL>, Yukun Hu <EMAIL>."
Variable: Pseudocode
Result: Yes
LLM Response: "We present the pseudo-algorithms in Appendix D.2, as shown in Algorithms 2 and 3." The listed algorithms are: Algorithm 1, Tensor-Var training with deep feature; Algorithm 2, Tensor-Var training with kernel feature; Algorithm 3, Tensor-Var assimilation-forecasting.
Variable: Open Source Code
Result: No
LLM Response: The paper does not contain any explicit statements about the release of source code for the described methodology, nor does it provide any links to a code repository.
Variable: Open Datasets
Result: Yes
LLM Response: "To assess the practical applicability of Tensor-Var, we evaluate its performance in global medium-range weather forecasting (i.e. 3-5 days) by using a subset of the ECMWF Reanalysis v5 (ERA5) dataset for training and testing, with further details in Subsection 4.2 and (Rasp et al., 2024). Moreover, we incorporate observation locations extracted from the real-time weather satellite track into the NWP experiment with higher spatial resolution in Subsection 4.3." Section 4.3: "We extract satellite track data (latitude and longitude coordinates) from CelesTrak for the same periods as Section 4.2, matching it with ERA5 data to generate practical observations. These observations include satellite locations within two hours before the assimilation time, sampled at half-hour intervals, with an average coverage of approximately 6%; see Figure 5." CelesTrak provides public orbital data for a wide range of satellites, including those with meteorological sensors, at www.celestrak.com.
Variable: Dataset Splits
Result: Yes
LLM Response: "We trained all models on ERA5 data from 1979-01-01 to 2016-01-01 and tested on data post-2018, with a qualitative evaluation shown for 2018-01-01 00:00 in Figure 9." Section E.3: "For model training, we use ERA5 data from 1979-01-01 to 2016-01-01, separating data from post-2018 for testing. There are 51,100 consecutive system states with generated observations for training and 2,920 for testing."
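The date-based split described above can be sketched with pandas. The 6-hourly sampling frequency shown is an illustrative assumption, not a detail confirmed by the excerpt:

```python
import pandas as pd

# Hypothetical index of assimilation times over the ERA5 period;
# the 6-hourly frequency is assumed for illustration only.
times = pd.date_range("1979-01-01", "2018-12-31 18:00", freq="6h")

# Train on data before 2016-01-01, test on data from 2018 onward,
# matching the split reported in the paper's Section E.3.
train_times = times[times < pd.Timestamp("2016-01-01")]
test_times = times[times >= pd.Timestamp("2018-01-01")]
```

A strict date cutoff like this avoids temporal leakage between training and test sets, which matters for autocorrelated geophysical data.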
Variable: Hardware Specification
Result: Yes
LLM Response: "All training is conducted on a workstation with a 48-core AMD 7980X CPU and an Nvidia GeForce 4090 GPU." Figure 2 "... shows the evaluation times on an Nvidia RTX-4090 GPU."
Variable: Software Dependencies
Result: No
LLM Response: "For all baseline methods, we employ the L-BFGS algorithm for Variational Data Assimilation (Var-DA) optimization, implemented in JAX (Bradbury et al., 2018). ... For Tensor-Var, we apply interior-point quadratic programming to solve the linearized 4D-Var optimization, using CVXPY (Diamond & Boyd, 2016)." Section E.5: "kernel PCA projected to the first 60 and 120 eigen-coordinates in scikit-learn (Pedregosa et al., 2011) ... All models are trained with the Adam optimizer (Kingma, 2014)."
Variable: Experiment Setup
Result: Yes
LLM Response: "For Tensor-Var, the history length m is selected using a cross-validation approach as an ablation study in Subsection 4.4, and the objective function in the feature space is minimized using quadratic programming, implemented via CVXPY (Diamond & Boyd, 2016). The baselines use the L-BFGS method (Nocedal & Wright, 2006) with 10 history vectors for the Hessian approximation. For 4D-Var baselines, we consider two cases (with and without adjoint models). The background state sb is set to the mean state of the training set." Section E: "All models are trained with the Adam optimizer (Kingma, 2014) for 200 epochs, using batch sizes from 256 to 1024 for stable operator estimation."
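The L-BFGS baseline setup (10 history vectors for the Hessian approximation) can be sketched by pairing a JAX-differentiated 4D-Var cost with SciPy's L-BFGS-B, where `jax.grad` stands in for an adjoint model. The toy dynamics, dimensions, and observations below are illustrative assumptions:

```python
import jax
import jax.numpy as jnp
import numpy as np
from scipy.optimize import minimize

def model_step(x):
    """Hypothetical smooth nonlinear dynamics (stand-in for the real model)."""
    return x + 0.05 * jnp.sin(x)

x_b = jnp.zeros(5)                       # background state (toy)
ys = [0.5 * jnp.ones(5)] * 3             # synthetic full-state observations

def cost(x0, obs):
    """Strong-constraint 4D-Var cost with identity covariances."""
    j = 0.5 * jnp.sum((x0 - x_b) ** 2)   # background term
    x = x0
    for y in obs:
        x = model_step(x)
        j += 0.5 * jnp.sum((x - y) ** 2)  # observation term
    return j

grad = jax.grad(cost)                    # reverse-mode AD replaces a hand-coded adjoint
res = minimize(
    lambda v: float(cost(jnp.asarray(v), ys)),
    np.zeros(5),
    jac=lambda v: np.asarray(grad(jnp.asarray(v), ys), dtype=np.float64),
    method="L-BFGS-B",
    options={"maxcor": 10},              # 10 history vectors, as in the paper
)
```

The `maxcor` option controls how many curvature pairs L-BFGS-B retains, matching the "10 history vectors" mentioned for the baselines.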