SuperBench: A Super-Resolution Benchmark Dataset for Scientific Machine Learning

Authors: Pu Ren, N. Benjamin Erichson, Junyi Guo, Shashank Subramanian, Omer San, Zarija Lukic, Michael W. Mahoney

DMLR 2025

Reproducibility Variable Result LLM Response
Research Type Experimental We investigate a range of degradation functions tailored for scientific data. In addition to commonly used methods such as uniform and bicubic downscaling, we explore the use of LR simulations as inputs and consider the introduction of noise to the input data. This equips SuperBench for a thorough assessment and effective comparison of different SR methods. We benchmark existing SR methods on SuperBench. By employing both data-centric and physics-centric metrics, our analysis provides valuable insights into the performance of various SR approaches.
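Two of the degradation operators named above, uniform downscaling and additive input noise, are simple enough to sketch. The snippet below is illustrative only: the function names are hypothetical, and "uniform downscaling" is read here as equidistant subsampling, which may differ from the benchmark's exact operator.

```python
import numpy as np

def uniform_downscale(hr, factor):
    """Equidistant subsampling of a 2D snapshot by an integer factor
    (one reading of 'uniform downscaling'; SuperBench's exact operator may differ)."""
    return hr[::factor, ::factor]

def add_noise(lr, sigma, seed=0):
    """Additive Gaussian noise, mimicking the noisy-input degradation track."""
    rng = np.random.default_rng(seed)
    return lr + sigma * rng.standard_normal(lr.shape)

hr = np.arange(16.0).reshape(4, 4)   # toy 4x4 "high-resolution" field
lr = uniform_downscale(hr, 2)        # 2x2 low-resolution input
noisy = add_noise(lr, sigma=0.1)     # noisy LR variant
```

Bicubic downscaling and LR-simulation inputs would replace `uniform_downscale` with an interpolation kernel or a coarse-grid solver run, respectively.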
Researcher Affiliation Academia Pu Ren (1) EMAIL, N. Benjamin Erichson (1,2) EMAIL, Junyi Guo (2) EMAIL, Shashank Subramanian (1) EMAIL, Omer San (3) EMAIL, Zarija Lukić (1) EMAIL, Michael W. Mahoney (1,2,4) EMAIL. (1) Lawrence Berkeley National Lab; (2) International Computer Science Institute; (3) University of Tennessee, Knoxville; (4) University of California at Berkeley
Pseudocode No The paper describes various Super-Resolution (SR) methods and their training protocols but does not include any explicit pseudocode blocks or algorithms.
Open Source Code Yes To address this, we introduce SuperBench (https://github.com/erichson/SuperBench), the first benchmark dataset featuring high-resolution datasets (up to 2048 × 2048 dimensions), including data from fluid flows, cosmology, and weather. The code for processing the datasets, running baseline models, and evaluating model performance is publicly available in our GitHub repository. The README file contains system requirements, installation instructions, and running examples. Detailed training information is provided in Appendix C.
Open Datasets Yes To address this, we introduce SuperBench (https://github.com/erichson/SuperBench), the first benchmark dataset featuring high-resolution datasets (up to 2048 × 2048 dimensions), including data from fluid flows, cosmology, and weather. SuperBench is hosted on the shared file systems of the National Energy Research Scientific Computing Center (NERSC) platform. The data is publicly available at the following link (https://portal.nersc.gov/project/dasrepo/superbench). Users can download the dataset locally either by clicking on the provided link or by using the wget command in the terminal. The ERA5 dataset provided in SuperBench is available in the public domain.
Dataset Splits Yes Table 1: Summary of datasets in SuperBench. "LR sim." denotes that LR simulation data is included in this dataset as inputs. Datasets | Spatial resolution | # samples (train/valid/test) | File size: Fluid flow data (Re = 16000) | 2048 × 2048 | 1000 / 200 / 200 | 66 GB; (w/ LR sim.) | 2048 × 2048 | 1200 / 200 / 200 | 80 GB ... Appendix A. Additional Data Details: In SuperBench, we provide two perspectives to evaluate model performance: interpolation and extrapolation. The data split is performed during preprocessing. We use train, valid1, valid2, test1, and test2 to denote the training set, in-distribution validation set, out-of-distribution validation set, in-distribution testing set, and out-of-distribution testing set, respectively. The detailed data splitting is presented in Table 1.
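The five-way split described above can be summarized in a small lookup table. The split names and their in-/out-of-distribution roles come from the quoted text; the dictionary layout and sample counts for the Re = 16000 fluid-flow dataset (from Table 1) are shown for illustration only.

```python
# Split names used by SuperBench and their evaluation regimes (from Appendix A).
SPLITS = {
    "train":  "training set",
    "valid1": "in-distribution (interpolation) validation set",
    "valid2": "out-of-distribution (extrapolation) validation set",
    "test1":  "in-distribution (interpolation) test set",
    "test2":  "out-of-distribution (extrapolation) test set",
}

# Example sample counts (train / valid / test) for the Re = 16000
# fluid-flow dataset, as listed in Table 1.
FLUID_RE16000 = {"train": 1000, "valid": 200, "test": 200}
```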
Hardware Specification Yes Note that all the experiments in SuperBench can be reproduced on an NVIDIA A100 GPU with 40GB memory, which ensures accessibility for researchers with standard computational resources. All models are trained from scratch on NVIDIA A100 GPUs. The fluid and cosmology data are simulated using multiple NVIDIA A100 GPU nodes on Perlmutter, a supercomputer at LBNL.
Software Dependencies No The paper mentions specific optimizers such as Adam (Kingma and Ba, 2015) and AdamW (Loshchilov and Hutter, 2019) and various deep learning models (SRCNN, Sub-pixel CNN, SRGAN, EDSR, WDSR, FNO, SwinIR), but it does not specify versions of programming languages (e.g., Python), deep learning frameworks (e.g., PyTorch, TensorFlow), or other libraries used for implementation.
Experiment Setup Yes In this section, we provide additional training details for all baseline models. For all datasets in SuperBench, the patch size is defined as 128 × 128. The number of patches per snapshot is set to 8. ... The learning rate is set to 1 × 10^-3 and the weight decay to 1 × 10^-5. We train the SRCNN models for 200 epochs. ... The batch size is set to 32 and Mean Squared Error (MSE) is employed as the loss function. ... For the generator of SRGAN, we employ 16 residual blocks, each with a hidden channel dimension of 64. ... The learning rate is configured as 2 × 10^-4 with a weight decay of 1 × 10^-6. All the datasets are trained for 600 epochs using the Adam optimizer, with a batch size of 512 for parallel training across 4 GPUs.
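The SRCNN hyperparameters quoted above can be grouped into a single configuration for reference. The dictionary layout is illustrative (the paper does not pin down a framework), the exponents in the quoted text are read as negative powers (learning rate 1e-3, weight decay 1e-5), and the MSE helper is a plain NumPy restatement of the stated loss, not code from the SuperBench repository.

```python
import numpy as np

# SRCNN baseline hyperparameters as quoted above (grouping is illustrative).
srcnn_cfg = {
    "patch_size": (128, 128),
    "patches_per_snapshot": 8,
    "optimizer": "Adam",
    "learning_rate": 1e-3,   # quoted "1 10 3", read as 1e-3
    "weight_decay": 1e-5,    # quoted "1 10 5", read as 1e-5
    "epochs": 200,
    "batch_size": 32,
    "loss": "MSE",
}

def mse_loss(pred, target):
    """Mean Squared Error, the training loss stated for the baselines."""
    return float(np.mean((np.asarray(pred) - np.asarray(target)) ** 2))
```

The SRGAN settings quoted afterwards (2e-4 learning rate, 1e-6 weight decay, 600 epochs, batch size 512 across 4 GPUs) would form an analogous configuration.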