Adaptive Flow Matching for Resolving Small-Scale Physics

Authors: Stathi Fotiadis, Noah D Brenowitz, Tomas Geffner, Yair Cohen, Michael Pritchard, Arash Vahdat, Morteza Mardani

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct extensive experiments on both synthetic and real datasets. For the real data, we use the same data as Mardani et al. (2023): the best estimates of the 25 km and 2 km observed atmospheric state available from meteorological agencies, centered around a region containing Taiwan. Additionally, we synthesize dynamics from a multiresolution variant of 2D fluid flow, where we can control the degree of misalignment. Our results show that AFM consistently outperforms existing methods across various skill metrics. Overall, our main contributions are summarized as: Experiments on Real and Synthetic Data: our results show AFM outperforms existing alternatives.
Researcher Affiliation | Collaboration | Stathi Fotiadis (1), Noah D Brenowitz (2), Tomas Geffner (2), Yair Cohen (2), Michael Pritchard (2), Arash Vahdat (2), Morteza Mardani (2). (1) Imperial College London; (2) NVIDIA. Correspondence to: Morteza Mardani <EMAIL>.
Pseudocode | Yes | Algorithm 1: AFM training; Algorithm 2: AFM sampling.
Open Source Code | No | The paper does not provide an explicit statement about releasing source code for the methodology, nor does it include a link to a code repository. It mentions "Animated PNGs of ensemble members for different models and τ are provided at https://t.ly/ZCq9Z", but this is for results visualization, not code.
Open Datasets | Yes | Experiments on real-world weather data (e.g., 25 km to 2 km super-resolution in Taiwan) and synthetic Kolmogorov flow indicate that our Adaptive Flow Matching (AFM) framework provides improvements over prior baselines, particularly for more stochastic channels, and consistently achieves better-calibrated ensembles. We evaluate the performance of the proposed Adaptive Flow Matching (AFM) model on two datasets: i) a regional weather downscaling dataset with real-world meteorological observations from Taiwan's Central Weather Administration (CWA); ii) a synthetic multiscale Kolmogorov flow dataset, designed to capture variable degrees of misalignment. Input coarse-resolution data at the 25 km scale comes from ERA5 (Hersbach et al., 2020), while the target fine-resolution 2 km scale data is sourced from the Central Weather Administration (CWA, 2021).
Dataset Splits | Yes | The training set comprises observations from 2018 to 2020, totaling 24,601 data points. The remaining data from 2021, consisting of 6,501 data points, were reserved for evaluation purposes. For each τ value, we generate a dataset comprising 100,000 training points and 500 test points.
Hardware Specification | Yes | Training is distributed across 8 DGX nodes, each with 8 A100 GPUs, using data parallelism and a total batch size of 512.
Software Dependencies | No | The paper mentions using EDM (Karras et al., 2022) and the Adam optimizer, but it does not specify explicit version numbers for any software libraries or tools, which is required for reproducibility.
Experiment Setup | Yes | The base channel size is 32, multiplied by [1, 2, 2, 4, 4] across layers. Please note this is a scaled-down version of the model used in (Mardani et al., 2023) due to computational constraints. Attention resolution is set to 28. Optimizer: We use the Adam optimizer with a learning rate of 10^-4, β1 = 0.9, β2 = 0.99. Dropout is applied with a rate of 0.13. CFM, CDM, and CorrDiff are trained for 50 million steps, whereas the regression UNet is trained for 20 million steps. For AFM, we evaluate the encoder's RMSE every 10k steps and update σz using EMA with α = 0.9. Our sampling process employs Euler integration with 50 steps across all methods. We begin with a maximum noise variance σmax and decrease it to a minimum of σmin = 0.002. The value of σmax varies depending on the method: for CDM and CorrDiff, we use σmax = 800, as per the original implementation in (Mardani et al., 2023); for CFM, we set σmax = 1, as specified in (Lipman et al., 2022); and for AFM, we use the σz value learned during training.
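The two procedural details quoted above (the EMA update of σz from the encoder's RMSE every 10k steps, and Euler integration over a decreasing noise schedule) can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the function names, the linear sigma schedule, and the scalar `velocity` callback are all assumptions made here for clarity.

```python
ALPHA = 0.9          # EMA coefficient for sigma_z, as quoted in the setup
EVAL_EVERY = 10_000  # encoder RMSE is evaluated every 10k training steps

def update_sigma_z(sigma_z, step, encoder_rmse):
    """EMA update of sigma_z from the encoder's RMSE.

    Only updates on steps that are multiples of EVAL_EVERY; otherwise
    returns sigma_z unchanged. `encoder_rmse` is a callable returning
    the current validation RMSE (hypothetical interface).
    """
    if step % EVAL_EVERY == 0:
        sigma_z = ALPHA * sigma_z + (1.0 - ALPHA) * encoder_rmse()
    return sigma_z

def euler_sample(velocity, x, sigma_max, sigma_min=0.002, n_steps=50):
    """Euler integration with 50 steps from sigma_max down to sigma_min.

    `velocity(x, sigma)` stands in for the learned flow/velocity field;
    a linear schedule is assumed here purely for illustration.
    """
    d_sigma = (sigma_min - sigma_max) / n_steps
    sigma = sigma_max
    for _ in range(n_steps):
        x = x + d_sigma * velocity(x, sigma)
        sigma += d_sigma
    return x
```

Under this sketch, AFM would pass the learned σz as `sigma_max`, while CFM would use `sigma_max=1` and CDM/CorrDiff `sigma_max=800`, matching the per-method values quoted above.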