CirT: Global Subseasonal-to-Seasonal Forecasting with Geometry-inspired Transformer
Authors: Yang Liu, Zinan Zheng, Jiashun Cheng, Fugee Tsung, Deli Zhao, Yu Rong, Jia Li
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on the ERA5 reanalysis dataset demonstrate that the model yields a significant improvement over advanced data-driven models, including Pangu-Weather and GraphCast, as well as skillful ECMWF systems. The authors also empirically show the effectiveness of their model designs and high-quality prediction over spatial and temporal dimensions. Section 4 is titled "EXPERIMENTS", detailing dataset usage, metrics, baselines, and results; Section 4.2 presents an "ABLATION STUDY". |
| Researcher Affiliation | Collaboration | The authors are affiliated with "The Hong Kong University of Science and Technology (Guangzhou)", "The Hong Kong University of Science and Technology" (academic institutions), and "DAMO Academy, Alibaba Group" (an industry institution). This mix indicates a collaborative affiliation. |
| Pseudocode | No | The paper describes the model architecture and processes (circular patching, transformer encoder, DFT, IDFT) using textual descriptions and mathematical equations. Figure 2 shows an architecture diagram. However, it does not contain a clearly labeled pseudocode block or algorithm. |
| Open Source Code | Yes | The code link is: https://github.com/compasszzn/CirT. |
| Open Datasets | Yes | Dataset: "We evaluate the effectiveness of CirT on the ERA5 reanalysis dataset (Hersbach et al., 2020)." |
| Dataset Splits | Yes | The paper uses the 1979–2016 data (38 years) for training, the 2017 data for validation, and the 2018 data for testing. |
| Hardware Specification | Yes | All models are implemented in PyTorch Lightning and trained on 8 GeForce RTX 4090 GPUs. Inference of downloaded models is performed on an NVIDIA A800 80G GPU. |
| Software Dependencies | No | The paper mentions PyTorch Lightning as the framework used ("All models are implemented based on Pytorch Lightning"), but does not specify its version or any other software dependencies with version numbers. |
| Experiment Setup | Yes | The following hyper-parameters are used for all direct-training baselines: batch size 16, hidden dimension 256, and 16 attention heads. All models have 8 layers, use a learning rate of 0.01, and are trained for 20 epochs. |
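The paper's core architectural ideas, circular patching of latitude circles plus a DFT/IDFT pair, are described only in text and equations (no pseudocode is released beyond the GitHub repository). As a rough illustration of the frequency-domain step, the NumPy sketch below treats each latitude row of a gridded field as one circular patch and transforms it along longitude with a real FFT; the function names, shapes, and the use of `rfft`/`irfft` are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def circular_patch_dft(field: np.ndarray) -> np.ndarray:
    """Hypothetical sketch: treat each latitude row of a (n_lat, n_lon)
    grid as a circular patch and map it to the frequency domain with a
    real FFT over the (periodic) longitude axis."""
    return np.fft.rfft(field, axis=-1)

def circular_patch_idft(spectral: np.ndarray, n_lon: int) -> np.ndarray:
    """Inverse step (IDFT): map spectral coefficients back to the
    longitude grid."""
    return np.fft.irfft(spectral, n=n_lon, axis=-1)

# Round trip on a toy 32x64 grid: the DFT/IDFT pair is invertible
# up to floating-point error, so no information is lost by working
# in the spectral domain.
grid = np.random.default_rng(0).standard_normal((32, 64))
recon = circular_patch_idft(circular_patch_dft(grid), n_lon=64)
assert np.allclose(grid, recon)
```

Because longitude wraps around the globe, the DFT's implicit periodicity matches the geometry exactly, which is the motivation the paper gives for the circular design.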