EWMoE: An Effective Model for Global Weather Forecasting with Mixture-of-Experts

Authors: Lihao Gan, Xin Man, Chenghong Zhang, Jie Shao

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct our evaluation on the ERA5 dataset using only two years of training data. Extensive experiments demonstrate that EWMoE outperforms current models such as FourCastNet and ClimaX at all forecast lead times, achieving competitive performance compared with the state-of-the-art models Pangu-Weather and GraphCast in evaluation metrics such as Anomaly Correlation Coefficient (ACC) and Root Mean Square Error (RMSE). Additionally, ablation studies indicate that applying the MoE architecture to weather forecasting offers significant advantages in improving accuracy and resource efficiency.
Researcher Affiliation | Academia | 1 University of Electronic Science and Technology of China, Chengdu, China; 2 Sichuan Artificial Intelligence Research Institute, Yibin, China; 3 Institute of Plateau Meteorology, China Meteorological Administration, Chengdu, China. EMAIL, EMAIL, EMAIL
Pseudocode | No | The paper describes methods and equations but does not include any clearly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | Our implementation code is available at https://github.com/technomii/EWMoE.
Open Datasets | Yes | ERA5 (Hersbach et al. 2020) is a publicly available atmospheric reanalysis dataset produced by the European Centre for Medium-Range Weather Forecasts (ECMWF).
Dataset Splits | Yes | In addition, to demonstrate the effectiveness of our model in the case of limited data and computing resources, we use two years of data for training (2015 and 2016), one year for validation (2017), and one year for testing (2018).
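The year-based split quoted above can be sketched as a simple partition over sample timestamps. This is a minimal illustration; the function and variable names are hypothetical and not taken from the authors' code.

```python
from datetime import datetime

# Hypothetical helper illustrating the split described in the paper:
# 2015-2016 for training, 2017 for validation, 2018 for testing.
def split_by_year(timestamps):
    splits = {"train": [], "val": [], "test": []}
    for ts in timestamps:
        if ts.year in (2015, 2016):
            splits["train"].append(ts)
        elif ts.year == 2017:
            splits["val"].append(ts)
        elif ts.year == 2018:
            splits["test"].append(ts)
    return splits

# Two dummy samples per year, 2015 through 2018:
samples = [datetime(y, m, 1) for y in range(2015, 2019) for m in (1, 7)]
counts = {k: len(v) for k, v in split_by_year(samples).items()}
print(counts)  # {'train': 4, 'val': 2, 'test': 2}
```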
Hardware Specification | Yes | The training of EWMoE was completed in under 9 days on 2 NVIDIA 3090 GPUs.
Software Dependencies | No | The paper mentions using the AdamW optimizer but does not specify versions for any key software libraries or frameworks (e.g., PyTorch or TensorFlow).
Experiment Setup | Yes | Each input sample from the ERA5 dataset can be represented as an image with 20 channels. We set the patch size as 8×8, and the EWMoE model consists of encoders with depth=6, dim=768 and decoders with depth=6, dim=512. Each encoder has a MoE layer, and each MoE layer consists of 20 independent experts. Specifically, in the gating network of each MoE layer, we use top-2 routing to select the top-2 ranked experts for forward propagation of training data. We employ the AdamW optimizer with two momentum parameters β1=0.9 and β2=0.95, and set the weight decay to 0.05.
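The top-2 routing described in the setup above can be sketched as a softmax gate that keeps only the two highest-scoring of the 20 experts. This is a minimal pure-Python illustration, not the authors' implementation; all names are hypothetical.

```python
import math

NUM_EXPERTS = 20  # each MoE layer in EWMoE has 20 independent experts

def top2_route(gate_logits):
    """Sketch of top-2 routing: softmax over the gate logits,
    keep the two highest-scoring experts, renormalise their weights."""
    m = max(gate_logits)
    exps = [math.exp(g - m) for g in gate_logits]  # numerically stable softmax
    total = sum(exps)
    probs = [e / total for e in exps]
    # indices of the two highest-probability experts
    top2 = sorted(range(len(probs)), key=probs.__getitem__, reverse=True)[:2]
    # renormalise the two selected weights so they sum to 1
    norm = sum(probs[i] for i in top2)
    return [(i, probs[i] / norm) for i in top2]  # (expert index, weight)

# A token whose gate happens to favour experts 3 and 7:
logits = [0.0] * NUM_EXPERTS
logits[3], logits[7] = 2.0, 1.5
routing = top2_route(logits)
print(routing)
```

Only the two selected experts run a forward pass for this token, which is the source of the resource efficiency the ablation studies report.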