Autoencoders in Function Space
Authors: Justin Bunker, Mark Girolami, Hefin Lambley, Andrew M. Stuart, T. J. Sullivan
JMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We validate FAE and FVAE on examples from the sciences, including problems governed by SDEs and PDEs, to discover low-dimensional latent structure from data, and use our models for inpainting, superresolution, and generative modelling, exploiting the ability to discretise the encoder and decoder on any mesh. |
| Researcher Affiliation | Academia | Justin Bunker EMAIL Department of Engineering University of Cambridge Cambridge, CB2 1TN, United Kingdom; Mark Girolami EMAIL Department of Engineering, University of Cambridge and Alan Turing Institute Cambridge, CB2 1TN, United Kingdom; Hefin Lambley EMAIL Mathematics Institute University of Warwick Coventry, CV4 7AL, United Kingdom; Andrew M. Stuart EMAIL Computing + Mathematical Sciences California Institute of Technology Pasadena, CA 91125, United States of America; T. J. Sullivan EMAIL Mathematics Institute & School of Engineering University of Warwick Coventry, CV4 7AL, United Kingdom |
| Pseudocode | No | No explicit pseudocode or algorithm blocks are provided in the paper. The methodology is described through mathematical formulations and textual explanations. |
| Open Source Code | Yes | Code accompanying the paper is available at https://github.com/htlambley/functional-autoencoders. |
| Open Datasets | Yes | The training data set, based on that of Li et al. (2021), consists of 8,000 samples from Υ generated on a 64 × 64 grid using a pseudospectral solver, with a further 2,000 independent samples held out as an evaluation set. Further details are provided in Appendix B.5. The data we use is based on that provided online by Li et al. (2021), given on a 421 × 421 grid and generated through a finite-difference scheme. |
| Dataset Splits | Yes | The training data set, based on that of Li et al. (2021), consists of 8,000 samples from Υ generated on a 64 × 64 grid using a pseudospectral solver, with a further 2,000 independent samples held out as an evaluation set. The training data set is based on that of Li et al. (2021) and consists of 1,024 samples from Υ on a 421 × 421 grid, with a further 1,024 samples held out as an evaluation set. |
| Hardware Specification | Yes | All experiments were run on a single NVIDIA GeForce RTX 4090 GPU with 24 GB of VRAM. |
| Software Dependencies | No | The paper mentions using the Adam optimiser, but does not provide specific version numbers for software libraries, frameworks (e.g., PyTorch, TensorFlow), or Python versions. |
| Experiment Setup | Yes | We train for 100,000 steps with initial learning rate 10^-3 and an exponential decay of 0.98 applied every 1,000 steps, with batch size 32 and 4 Monte Carlo samples for Q_θ(z|u). We use latent dimension d_Z = 1, β = 1.2 and λ = 10. |
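The learning-rate schedule quoted in the Experiment Setup row can be read as a staircase exponential decay: the rate starts at 10^-3 and is multiplied by 0.98 after every 1,000 optimiser steps. The helper below is an illustrative sketch of that schedule, not code from the authors' repository; the function name and signature are our own.

```python
def learning_rate(step, init_lr=1e-3, decay_rate=0.98, transition_steps=1000):
    """Staircase exponential decay, as described in the experiment setup:
    the rate is multiplied by `decay_rate` once every `transition_steps` steps.
    Illustrative sketch only; not taken from the paper's implementation."""
    return init_lr * decay_rate ** (step // transition_steps)

# Over the full 100,000-step run the schedule applies 100 decay events,
# so the final rate is 1e-3 * 0.98**100, roughly 1.3e-4.
final_lr = learning_rate(100_000)
```

In practice such a schedule would be passed to the Adam optimiser named in the Software Dependencies row (e.g. via `optax.exponential_decay` in a JAX codebase), but the paper does not specify the framework version used.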