Gridded Transformer Neural Processes for Spatio-Temporal Data

Authors: Matthew Ashman, Cristiana Diaconu, Eric Langezaal, Adrian Weller, Richard E Turner

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our method consistently outperforms a range of strong baselines in various synthetic and real-world regression tasks involving large-scale data, while maintaining competitive computational efficiency. Experiments with weather data highlight the potential of gridded TNPs and serve as just one example of a domain where they can have a significant impact.
Researcher Affiliation | Academia | ¹University of Cambridge, ²University of Amsterdam, ³Alan Turing Institute. Correspondence to: Matthew Ashman <EMAIL>, Cristiana Diaconu <EMAIL>.
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks within its content. It refers to external pseudocode in Appendix B.1: "Pseudo-code for a forward pass through these models is provided in Algorithms 3 and 4 in Ashman et al. (2024b)."
Open Source Code | Yes | We also provide a public implementation of gridded TNPs in the repository https://github.com/cambridge-mlg/gridded-tnp.
Open Datasets | Yes | We perform two experiments on data from the ERA5 reanalysis by the European Centre for Medium-Range Weather Forecasts (ECMWF; Hersbach et al. 2020). We construct each context dataset by combining the t2m at a random subset of 9,957 weather station locations ... extracted from the HadISD dataset (Dunn et al., 2012). To illustrate the generality of our framework, we include an additional study in Appendix G.6 using the large-scale EAGLE fluid-dynamics dataset (Janny et al., 2023).
Dataset Splits | Yes | We train on data between 2009-2017, validate on 2018 and test on 2019.
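The reported year-based split (train 2009-2017, validate 2018, test 2019) can be sketched as a simple filter over sample timestamps. This is an illustrative helper, not code from the paper's repository; the function name and the list-of-datetimes input format are assumptions.

```python
def split_by_year(timestamps):
    """Partition samples into train/val/test by year, following the
    reported 2009-2017 / 2018 / 2019 scheme. `timestamps` is any
    iterable of datetime objects; returns three lists."""
    train = [t for t in timestamps if 2009 <= t.year <= 2017]
    val = [t for t in timestamps if t.year == 2018]
    test = [t for t in timestamps if t.year == 2019]
    return train, val, test
```

In practice the same year-based filter would be applied to the indices of the ERA5/HadISD samples rather than to raw datetimes, but the partitioning logic is identical.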
Hardware Specification | Yes | Training and inference for all models is performed on one NVIDIA GeForce RTX 2080 Ti. Training and inference are performed using a single NVIDIA A100 80GB with 32 CPU cores.
Software Dependencies | No | The paper mentions software like GPyTorch, the AdamW optimizer, and a U-Net architecture, but does not provide specific version numbers for any of these or other key software components, which is required for reproducibility.
Experiment Setup | Yes | For all experiments and all models, we use the AdamW optimiser (Loshchilov & Hutter, 2019) with a fixed learning rate of 5 × 10⁻⁴ and apply gradient clipping to gradients with magnitude greater than 0.5. In all experiments, we use C = 128, a kernel size of five or nine, and a stride of one. All MHSA / MHCA operations use H = 8 heads, each with D_V = 16 dimensions. We use D_z = D_QK = 128 throughout. We train all models for 500,000 iterations on 160,000 pre-generated datasets using a batch size of eight.
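The reported optimisation setup (AdamW at a fixed learning rate of 5e-4 with gradient clipping at magnitude 0.5) can be sketched in PyTorch as below. The tiny linear model is a stand-in only; the paper's gridded TNP architecture (C = 128, H = 8 heads, D_V = 16, D_z = D_QK = 128) is not reproduced here, and whether the paper clips by norm or by value is an assumption on our part.

```python
import torch

# Stand-in model; the actual gridded TNP architecture is not shown here.
model = torch.nn.Linear(4, 1)

# AdamW with the fixed learning rate reported in the paper.
optimiser = torch.optim.AdamW(model.parameters(), lr=5e-4)

def training_step(x, y):
    """One optimisation step with the reported clipping threshold of 0.5
    (assumed here to be norm-based clipping)."""
    optimiser.zero_grad()
    loss = torch.nn.functional.mse_loss(model(x), y)
    loss.backward()
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=0.5)
    optimiser.step()
    return loss.item()
```

The paper trains for 500,000 such iterations over 160,000 pre-generated datasets with a batch size of eight.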