CirT: Global Subseasonal-to-Seasonal Forecasting with Geometry-inspired Transformer

Authors: Yang Liu, Zinan Zheng, Jiashun Cheng, Fugee Tsung, Deli Zhao, Yu Rong, Jia Li

ICLR 2025 | Venue PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "Extensive experiments on the ERA5 reanalysis dataset demonstrate our model yields a significant improvement over the advanced data-driven models, including Pangu-Weather and GraphCast, as well as skillful ECMWF systems. Additionally, we empirically show the effectiveness of our model designs and high-quality prediction over spatial and temporal dimensions." Section 4, "EXPERIMENTS", details dataset usage, metrics, baselines, and results; Section 4.2 presents the "ABLATION STUDY". |
| Researcher Affiliation | Collaboration | The authors are affiliated with The Hong Kong University of Science and Technology (Guangzhou) and The Hong Kong University of Science and Technology (academic institutions), as well as DAMO Academy, Alibaba Group (an industry institution). This mix indicates an academia-industry collaboration. |
| Pseudocode | No | The paper describes the model architecture and pipeline (circular patching, transformer encoder, DFT, IDFT) through textual descriptions and mathematical equations, and Figure 2 shows an architecture diagram. However, it contains no clearly labeled pseudocode or algorithm block. |
| Open Source Code | Yes | The code is available at https://github.com/compasszzn/CirT. |
| Open Datasets | Yes | "We evaluate the effectiveness of CirT on the ERA5 reanalysis dataset (Hersbach et al., 2020)." |
| Dataset Splits | Yes | "We use the 1979–2016 (38 years of) data for training, the 2017 data for validation, and the 2018 data for testing." |
| Hardware Specification | Yes | "All models are implemented based on PyTorch Lightning, trained on 8 GeForce RTX 4090 GPUs. We perform the inference of downloaded models on an NVIDIA A800 80G GPU." |
| Software Dependencies | No | The paper names PyTorch Lightning as the training framework ("All models are implemented based on PyTorch Lightning") but does not give a version for it, nor for any other software dependency. |
| Experiment Setup | Yes | "We use the following hyper-parameters for all direct training baselines: batch size 16, hidden dimension 256, and 16 attention heads. All models are set to 8 layers and the learning rate is 0.01. All models are trained for 20 epochs." |
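The reported hyper-parameters and the year-based ERA5 split can be captured in a small configuration sketch. This is only an illustrative summary of the numbers stated above, not code from the CirT repository; all names here are hypothetical:

```python
# Hyper-parameters reported for all direct-training baselines
# (dictionary keys are illustrative, not from the released code).
CONFIG = {
    "batch_size": 16,
    "hidden_dim": 256,
    "num_attention_heads": 16,
    "num_layers": 8,
    "learning_rate": 0.01,
    "epochs": 20,
}


def era5_split(year: int):
    """Map an ERA5 year to the paper's train/val/test split."""
    if 1979 <= year <= 2016:
        return "train"
    if year == 2017:
        return "val"
    if year == 2018:
        return "test"
    return None  # outside the evaluated range


# The training range spans 38 years, matching the quoted "(38 years of) data".
assert sum(era5_split(y) == "train" for y in range(1979, 2019)) == 38
```

Splitting by whole calendar years, as the paper does, keeps validation and test data strictly after the training period, which avoids temporal leakage in a forecasting evaluation.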