Efficient Training of Neural Stochastic Differential Equations by Matching Finite Dimensional Distributions

Authors: Jianxin Zhang, Josh Viktorov, Doosan Jung, Emily Pitler

ICLR 2025

Reproducibility Variable Result LLM Response
Research Type: Experimental. LLM Response: "We demonstrate that FDM achieves superior performance, consistently outperforming existing methods in terms of both computational efficiency and generative quality. ... Section 6 details the experimental setup and results, demonstrating the superiority of FDM in terms of both computational efficiency and generative performance across several benchmark datasets. ... Our experiments are conducted across five real-world datasets: energy prices, bonds, metal prices, U.S. stock indices, and exchange rates, as well as one synthetic dataset, the Rough Bergomi model."
Researcher Affiliation: Collaboration. LLM Response: "Jianxin Zhang, EECS, University of Michigan, Ann Arbor, MI 48109; Josh Viktorov, Doosan Jung, Emily Pitler, Cisco Systems (New York, NY; Lakewood, CO; San Jose, CA)."
Pseudocode: Yes. LLM Response: "In Algorithm 1, we present the concrete finite dimensional matching (FDM) algorithm derived from Theorem 2 to train a Neural SDE X_θ. ... Algorithm 1: Finite Dimensional Matching (FDM)"
Open Source Code: Yes. LLM Response: "Code available at https://github.com/Z-Jianxin/FDM"
Open Datasets: Yes. LLM Response: "Our experiments are conducted across five real-world datasets: energy prices, bonds, metal prices, U.S. stock indices, and exchange rates, as well as one synthetic dataset, the Rough Bergomi model. ... All real-world datasets are obtained from https://www.dukascopy.com/swiss/english/marketwatch/historical/"
Dataset Splits: Yes. LLM Response: "For the energy price and bonds datasets, we reserve the latest 20% of the data for testing, evaluating the trained models via the KS test on generated sequences against unseen future sequences. ... For our experiments, we first follow Issa et al. (2023) to train and evaluate the models on three datasets (metal prices, stock indices, and exchange rates) using sequences with 64 timestamps and random train-test splits."
Hardware Specification: Yes. LLM Response: "All models are trained and evaluated on a single NVIDIA H100 GPU."
Software Dependencies: No. LLM Response: "The paper mentions using fully connected neural networks, optimizers, and the RBF kernel, but does not provide specific version numbers for any software libraries or frameworks (e.g., Python, PyTorch, TensorFlow, scikit-learn versions)."
Experiment Setup: Yes. LLM Response: "For all experiments, we use fully connected neural networks to parameterize the drift and diffusion terms, with hyperparameters and preprocessing suggested in Issa et al. (2023). We choose s to be s(P, z) = (1/2) E_{Z,Z'~P}[k(Z, Z')] - E_{Z~P}[k(Z, z)], where k is the RBF kernel with unit kernel bandwidth. In particular, following Issa et al. (2023), we let our method and Trunc Sig train for 10000 steps, while SDE-GAN trains for 5000 steps and Sig Ker for 4000 steps, to normalize the training time."
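The score quoted in the experiment setup, s(P, z) = (1/2) E_{Z,Z'~P}[k(Z, Z')] - E_{Z~P}[k(Z, z)] with a unit-bandwidth RBF kernel, can be estimated from samples by Monte Carlo. The sketch below is an illustrative NumPy implementation, not the authors' code: the function names are hypothetical, and the kernel convention k(x, y) = exp(-||x - y||^2 / 2) for unit bandwidth is an assumption.

```python
import numpy as np

def rbf_kernel(x, y, bandwidth=1.0):
    # Assumed convention: k(x, y) = exp(-||x - y||^2 / (2 * bandwidth^2)).
    d = np.asarray(x) - np.asarray(y)
    return np.exp(-np.dot(d, d) / (2.0 * bandwidth ** 2))

def kernel_score(samples, z, bandwidth=1.0):
    # Monte Carlo estimate of s(P, z) = 1/2 E_{Z,Z'~P}[k(Z, Z')] - E_{Z~P}[k(Z, z)],
    # where P is represented by a list of samples (diagonal terms included).
    n = len(samples)
    kzz = np.mean([rbf_kernel(samples[i], samples[j], bandwidth)
                   for i in range(n) for j in range(n)])
    kz = np.mean([rbf_kernel(s, z, bandwidth) for s in samples])
    return 0.5 * kzz - kz
```

For instance, if every sample equals z, the estimate is 0.5 - 1 = -0.5; if z lies far from all samples, the second term vanishes and the estimate approaches 0.5 times the average self-similarity of the samples.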