CViT: Continuous Vision Transformer for Operator Learning
Authors: Sifan Wang, Jacob Seidman, Shyam Sankaran, Hanwen Wang, George Pappas, Paris Perdikaris
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate CViT's effectiveness across a diverse range of partial differential equation (PDE) systems, including fluid dynamics, climate modeling, and reaction-diffusion processes. Our comprehensive experiments show that CViT achieves state-of-the-art performance on multiple benchmarks, often surpassing larger foundation models, even without extensive pretraining and roll-out fine-tuning. |
| Researcher Affiliation | Collaboration | Sifan Wang¹, Jacob H. Seidman³˒⁴˒⁵, Shyam Sankaran³, Hanwen Wang², George J. Pappas⁴, Paris Perdikaris³ — ¹Institution for Foundation of Data Science, Yale University; ²Graduate Program in Applied Mathematics and Computational Science, University of Pennsylvania; ³Department of Mechanical Engineering and Applied Mechanics, University of Pennsylvania; ⁴Department of Electrical and Systems Engineering, University of Pennsylvania; ⁵Reality Defender |
| Pseudocode | No | The paper describes the architecture and components of CViT using text, equations, and diagrams (Figure 1), but it does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | All data and code are publicly available at https://github.com/PredictiveIntelligenceLab/cvit. |
| Open Datasets | Yes | All data and code are publicly available at https://github.com/PredictiveIntelligenceLab/cvit. We make use of the datasets and problem setup established by de Hoop et al. (2022). The dataset is generated by PDEArena (Gupta & Brandstetter, 2022) using SpeedyWeather.jl... The 2D Navier-Stokes data is generated by PDEArena (Gupta & Brandstetter, 2022)... We use the dataset generated by PDEBench (Takamoto et al., 2022)... |
| Dataset Splits | Yes | For training and evaluation, we considered a split of 20,000 samples used for training, 10,000 for validation, and 10,000 for testing. All models are trained with 5,600 trajectories and evaluated on the remaining 1,000 trajectories. The models are trained with 6,500 trajectories and tested on the remaining 1,300 trajectories. The models are trained with 9,000 trajectories and tested on the remaining 1,000 trajectories. The models are trained with 900 trajectories and tested on the remaining 100 trajectories. |
| Hardware Specification | Yes | All experiments were performed on a single Nvidia RTX A6000 GPU. |
| Software Dependencies | No | We also thank the developers of the software that enabled our research, including JAX (Bradbury et al., 2018), Matplotlib (Hunter, 2007), and NumPy (Harris et al., 2020). The paper mentions software tools used but does not provide specific version numbers for these dependencies. |
| Experiment Setup | Yes | We employ the AdamW optimizer (Kingma & Ba, 2014; Loshchilov & Hutter, 2017) with a weight decay of 10⁻⁵. Our learning rate schedule includes an initial linear warm-up phase of 5,000 steps, starting from zero and gradually increasing to 10⁻³, followed by an exponential decay at a rate of 0.9 for every 5,000 steps. The loss function is a one-step mean squared error (MSE)... All models are trained for 2×10⁵ iterations with a batch size B = 64. Within each batch, we randomly sample Q = 1,024 query coordinates from the grid and corresponding output labels. ...we use a patch size of 8×8 for tokenizing inputs. We also employ a decoder with a single cross-attention Transformer block for all configurations. The grid resolution is set to the spatial resolution of each dataset. The latent dimension of grid features is set to 512... we use β = 10⁵ to ensure sufficient locality of the interpolated features. |
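The learning-rate schedule quoted above (linear warm-up over 5,000 steps to 10⁻³, then decay by a factor of 0.9 every 5,000 steps) is easy to misread, so here is a minimal dependency-free sketch. Note this is an illustrative reconstruction, not the authors' code: the paper's experiments use JAX, where the equivalent would typically be `optax.warmup_exponential_decay_schedule` combined with `optax.adamw(weight_decay=1e-5)`, and whether the decay is applied in a staircase fashion (assumed here) or continuously is not stated in the quoted text.

```python
def lr_schedule(step, peak_lr=1e-3, warmup_steps=5000,
                decay_rate=0.9, decay_every=5000):
    """Learning rate at a given training step.

    Linear warm-up from 0 to peak_lr over warmup_steps, then
    staircase exponential decay: multiply by decay_rate once
    every decay_every steps (staircase behavior is an assumption).
    """
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * decay_rate ** ((step - warmup_steps) // decay_every)


# Spot-check a few points of the schedule:
print(lr_schedule(2500))    # mid warm-up: 5e-4
print(lr_schedule(5000))    # peak: 1e-3
print(lr_schedule(10000))   # one decay step later: 9e-4
```

Over the full 2×10⁵-iteration run this gives 39 decay steps after warm-up, so the final learning rate is roughly 10⁻³ · 0.9³⁹ ≈ 1.6×10⁻⁵.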