reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Score-Based Diffusion Models in Function Space

Authors: Jae Hyun Lim, Nikola B. Kovachki, Ricardo Baptista, Christopher Beckham, Kamyar Azizzadenesheli, Jean Kossaifi, Vikram Voleti, Jiaming Song, Karsten Kreis, Jan Kautz, Christopher Pal, Arash Vahdat, Anima Anandkumar

JMLR 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We theoretically and numerically verify the applicability of our approach on a set of function-valued problems, including generating solutions to the Navier-Stokes equation viewed as the push-forward distribution of forcings from a Gaussian Random Field (GRF), as well as Volcano InSAR and MNIST-SDF. ... 5. Numerical Experiments: In all examples, we use the Fourier neural operator (FNO) (Li et al., 2020a) and the U-shaped neural operator (UNO) (Rahman et al., 2023), as they are well-defined architectures for maps between Hilbert spaces (Li et al., 2020a; Kovachki et al., 2021). The goal of our numerics is to showcase the simple message that by employing trace-class noise and a consistent architecture for function space data, we obtain dimension (i.e., resolution)-independent results, observed by varying the discretization of the data. All experiments are done by solving Equation 16 in a way similar to Song and Ermon (2019), generalized to function spaces; see Appendix D.
Researcher Affiliation	Collaboration	Jae Hyun Lim (Université de Montréal); Nikola B. Kovachki (NVIDIA Corporation); Ricardo Baptista (California Institute of Technology); Christopher Beckham (Polytechnique Montréal); Kamyar Azizzadenesheli (NVIDIA Corporation); Jean Kossaifi (NVIDIA Corporation); Vikram Voleti (Université de Montréal); Jiaming Song (NVIDIA Corporation); Karsten Kreis (NVIDIA Corporation); Jan Kautz (NVIDIA Corporation); Christopher Pal (Polytechnique Montréal & Canada CIFAR AI Chair); Arash Vahdat (NVIDIA Corporation); Anima Anandkumar (NVIDIA Corporation & California Institute of Technology)
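Pseudocode	Yes	Algorithm 1 Annealed Langevin Dynamics. Input: $F_\theta$, $u_0 \in H$, $\{\sigma_t\}_{t=1}^{T}$, $M \in \mathbb{N}$, $\epsilon > 0$. for $t = 1$ to $T$ do: $h_t = \epsilon \sigma_t^2 / \sigma_T^2$; for $n = 0$ to $M - 1$ do: $\eta_n^{(t)} \sim N(0, C)$; $u_{n+1} = u_n + h_t F_\theta(u_n, t) + \sqrt{2 h_t}\, \eta_n^{(t)}$; end for; $u_0 = u_M$; end for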
Open Source Code	Yes	1. The code for this project is publicly available at https://github.com/lim0606/ddo
Open Datasets	Yes	We theoretically and numerically verify the applicability of our approach on a set of function-valued problems, including generating solutions to the Navier-Stokes equation viewed as the push-forward distribution of forcings from a Gaussian Random Field (GRF), as well as Volcano InSAR and MNIST-SDF. Keywords: Diffusion models, Score matching, Generative models, Operator learning, Function spaces
Dataset Splits	No	The paper does not provide specific training/test/validation dataset splits (e.g., percentages, exact counts, or predefined split references) needed for reproducibility. While it mentions generating N=10,000 samples for training for Gaussian Mixture and Navier-Stokes, and the total size for Volcano (4096 data points), it does not specify how these datasets were further partitioned into training, validation, and test sets for evaluation.
Hardware Specification	Yes	To ensure a fair comparison, we limited the size of all models to a similar scale (with the number of parameters kept below 3 million) and conducted all experiments using one NVIDIA A100 GPU. Additionally, we included a DDO model, whose size is 10 times larger for reference.
Software Dependencies	No	The paper mentions software components like "Fourier neural operator (FNO)", "U-shaped neural operator (UNO)", and "Adam optimizer", but it does not provide specific version numbers for these software dependencies or the frameworks used (e.g., PyTorch, TensorFlow).
Experiment Setup	Yes	In all examples we train by picking $I = [10]$ and sample with Algorithm 1 by fixing $M = 200$ and $\epsilon = 2 \times 10^{-5}$. We choose $\sigma_1 = 1.0$ and $\sigma_{10} = 0.01$ with all other $\sigma$ parameters defined by a geometric sequence... We train with the Adam optimizer for a total of 300 epochs and an initial learning rate $10^{-3}$, which is decayed by half every 50 epochs.
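
The Pseudocode and Experiment Setup rows together describe a complete sampling recipe, which can be sketched in NumPy as below. This is a minimal illustration under stated assumptions, not the authors' implementation (that lives in the linked repository). The function names, the 1D grid, the sine-basis covariance with eigenvalues k^(-2), and the toy score F(u, t) = -u standing in for the trained neural operator F_θ are all hypothetical choices made for the example. The noise update uses the square root of 2·h_t, the standard Langevin scaling, which is an assumption about the flattened pseudocode.

```python
import numpy as np


def geometric_sigmas(sigma_first=1.0, sigma_last=0.01, num_levels=10):
    """Noise levels sigma_1 > ... > sigma_T forming a geometric sequence."""
    return np.geomspace(sigma_first, sigma_last, num_levels)


def lr_at_epoch(epoch, base_lr=1e-3, decay_every=50):
    """Step schedule from the setup row: halve the rate every 50 epochs."""
    return base_lr * 0.5 ** (epoch // decay_every)


def sample_trace_class_noise(grid, rng, num_modes=32, power=2.0):
    """Draw eta ~ N(0, C) via a truncated Karhunen-Loeve expansion.

    Assumption: C is diagonal in the sine basis with eigenvalues k^(-power).
    For power > 1 the eigenvalues are summable, so C is trace-class.
    """
    k = np.arange(1, num_modes + 1, dtype=float)
    eigvals = k ** (-power)
    basis = np.sqrt(2.0) * np.sin(np.pi * np.outer(k, grid))  # (modes, points)
    xi = rng.standard_normal(num_modes)
    return (np.sqrt(eigvals) * xi) @ basis


def annealed_langevin(score_fn, u0, sigmas, num_steps, eps, noise_fn):
    """Algorithm 1: M Langevin steps at each noise level, coarse to fine."""
    u = np.array(u0, dtype=float)
    sigma_last = sigmas[-1]
    for t, sigma_t in enumerate(sigmas, start=1):
        h_t = eps * sigma_t**2 / sigma_last**2  # step size for level t
        for _ in range(num_steps):
            eta = noise_fn()
            u = u + h_t * score_fn(u, t) + np.sqrt(2.0 * h_t) * eta
    return u


# Usage with the reported hyperparameters: T = 10, M = 200, eps = 2e-5.
rng = np.random.default_rng(0)
grid = np.linspace(0.0, 1.0, 64)
sigmas = geometric_sigmas()


def toy_score(u, t):
    # Hypothetical stand-in for the trained F_theta; not a learned model.
    return -u


sample = annealed_langevin(
    toy_score,
    u0=sample_trace_class_noise(grid, rng),
    sigmas=sigmas,
    num_steps=200,
    eps=2e-5,
    noise_fn=lambda: sample_trace_class_noise(grid, rng),
)
```

Because the noise is defined through a fixed set of basis functions rather than per grid point, the same sampler can be evaluated on a finer or coarser `grid` without changing any hyperparameter, which is the resolution-independence property the quoted passage emphasizes.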