Efficient Training of Neural Stochastic Differential Equations by Matching Finite Dimensional Distributions

Authors: Jianxin Zhang, Josh Viktorov, Doosan Jung, Emily Pitler

ICLR 2025

Reproducibility Variable Result LLM Response
Research Type: Experimental. LLM Response: "We demonstrate that FDM achieves superior performance, consistently outperforming existing methods in terms of both computational efficiency and generative quality. ... Section 6 details the experimental setup and results, demonstrating the superiority of FDM in terms of both computational efficiency and generative performance across several benchmark datasets. ... Our experiments are conducted across five real-world datasets: energy prices, bonds, metal prices, U.S. stock indices, and exchange rates, as well as one synthetic dataset, the Rough Bergomi model."
Researcher Affiliation: Collaboration. LLM Response: "Jianxin Zhang, EECS, University of Michigan, Ann Arbor, MI 48109; Josh Viktorov, Doosan Jung, Emily Pitler, Cisco Systems (New York, NY; Lakewood, CO; San Jose, CA)."
Pseudocode: Yes. LLM Response: "In Algorithm 1, we present the concrete finite dimensional matching (FDM) algorithm derived from Theorem 2 to train a Neural SDE X_θ. ... Algorithm 1: Finite Dimensional Matching (FDM)"
Open Source Code: Yes. LLM Response: "Code available at https://github.com/Z-Jianxin/FDM"
Open Datasets: Yes. LLM Response: "Our experiments are conducted across five real-world datasets: energy prices, bonds, metal prices, U.S. stock indices, and exchange rates, as well as one synthetic dataset, the Rough Bergomi model. ... All real-world datasets are obtained from https://www.dukascopy.com/swiss/english/marketwatch/historical/"
Dataset Splits: Yes. LLM Response: "For the energy price and bonds datasets, we reserve the latest 20% of the data for testing, evaluating the trained models via the KS test on generated sequences against unseen future sequences. ... For our experiments, we first follow Issa et al. (2023) to train and evaluate the models on three datasets (metal prices, stock indices, and exchange rates) using sequences with 64 timestamps and random train-test splits."
Hardware Specification: Yes. LLM Response: "All models are trained and evaluated on a single NVIDIA H100 GPU."
Software Dependencies: No. LLM Response: "The paper mentions using fully connected neural networks, optimizers, and the RBF kernel, but does not provide specific version numbers for any software libraries or frameworks (e.g., Python, PyTorch, TensorFlow, scikit-learn versions)."
Experiment Setup: Yes. LLM Response: "For all experiments, we use fully connected neural networks to parameterize the drift and diffusion terms, with hyperparameters and preprocessing suggested in Issa et al. (2023). We choose s to be s(P, z) = (1/2) E_{Z,Z'~P}[k(Z, Z')] - E_{Z~P}[k(Z, z)], where k is the RBF kernel with unit kernel bandwidth. In particular, following Issa et al. (2023), we let our method and Trunc Sig train for 10000 steps, while SDE-GAN trains for 5000 steps and Sig Ker for 4000 steps, to normalize the training time."
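The score quoted in the experiment setup, s(P, z) = (1/2) E_{Z,Z'~P}[k(Z, Z')] - E_{Z~P}[k(Z, z)] with a unit-bandwidth RBF kernel, can be estimated from samples by Monte Carlo. The sketch below is an illustrative NumPy implementation, not the authors' code: the function names are hypothetical, and the kernel convention k(x, y) = exp(-||x - y||^2 / 2) for unit bandwidth is an assumption.

```python
import numpy as np

def rbf_kernel(x, y, bandwidth=1.0):
    # Assumed convention: k(x, y) = exp(-||x - y||^2 / (2 * bandwidth^2)).
    d = np.asarray(x) - np.asarray(y)
    return np.exp(-np.dot(d, d) / (2.0 * bandwidth ** 2))

def kernel_score(samples, z, bandwidth=1.0):
    # Monte Carlo estimate of s(P, z) = 1/2 E_{Z,Z'~P}[k(Z, Z')] - E_{Z~P}[k(Z, z)],
    # where P is represented by a list of samples (diagonal terms included).
    n = len(samples)
    kzz = np.mean([rbf_kernel(samples[i], samples[j], bandwidth)
                   for i in range(n) for j in range(n)])
    kz = np.mean([rbf_kernel(s, z, bandwidth) for s in samples])
    return 0.5 * kzz - kz
```

For instance, if every sample equals z, the estimate is 0.5 - 1 = -0.5; if z lies far from all samples, the second term vanishes and the estimate approaches 0.5 times the average self-similarity of the samples.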