Efficient Training of Neural Stochastic Differential Equations by Matching Finite Dimensional Distributions
Authors: Jianxin Zhang, Josh Viktorov, Doosan Jung, Emily Pitler
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate that FDM achieves superior performance, consistently outperforming existing methods in terms of both computational efficiency and generative quality. ... Section 6 details the experimental setup and results, demonstrating the superiority of FDM in terms of both computational efficiency and generative performance across several benchmark datasets. ... Our experiments are conducted across five real-world datasets: energy prices, bonds, metal prices, U.S. stock indices, and exchange rates, as well as one synthetic dataset, the Rough Bergomi model. |
| Researcher Affiliation | Collaboration | Jianxin Zhang (EECS, University of Michigan, Ann Arbor, MI 48109); Josh Viktorov, Doosan Jung, Emily Pitler (Cisco Systems; New York, NY / Lakewood, CO / San Jose, CA) |
| Pseudocode | Yes | In Algorithm 1, we present the concrete finite dimensional matching (FDM) algorithm derived from Theorem 2 to train a Neural SDE Xθ. Algorithm 1: Finite Dimensional Matching (FDM) |
| Open Source Code | Yes | Code available at https://github.com/Z-Jianxin/FDM |
| Open Datasets | Yes | Our experiments are conducted across five real-world datasets: energy prices, bonds, metal prices, U.S. stock indices, and exchange rates, as well as one synthetic dataset, the Rough Bergomi model. ... All real-world datasets are obtained from https://www.dukascopy.com/swiss/english/marketwatch/historical/ |
| Dataset Splits | Yes | For the energy price and bonds datasets, we reserve the latest 20% of the data for testing, evaluating the trained models via the KS test on generated sequences against unseen future sequences. ... For our experiments, we first follow Issa et al. (2023) to train and evaluate the models on three datasets (metal prices, stock indices, and exchange rates) using sequences with 64 timestamps and random train-test splits. |
| Hardware Specification | Yes | All models are trained and evaluated on a single NVIDIA H100 GPU. |
| Software Dependencies | No | The paper mentions using fully connected neural networks, optimizers, and the RBF kernel, but does not provide specific version numbers for any software libraries or frameworks (e.g., Python, PyTorch, TensorFlow, scikit-learn versions). |
| Experiment Setup | Yes | For all experiments, we use fully connected neural networks to parameterize the drift and diffusion terms, with hyperparameters and preprocessing suggested in Issa et al. (2023). We choose s to be $s(P, z) = \frac{1}{2}\mathbb{E}_{Z, Z' \sim P}\,k(Z, Z') - \mathbb{E}_{Z \sim P}\,k(Z, z)$, where k is the RBF kernel with unit kernel bandwidth. In particular, following Issa et al. (2023), we let our method and Trunc Sig train for 10000 steps, while SDE-GAN trains for 5000 steps and Sig Ker for 4000 steps, to normalize the training time. |
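The scoring rule quoted in the Experiment Setup row, $s(P, z) = \frac{1}{2}\mathbb{E}_{Z, Z' \sim P}\,k(Z, Z') - \mathbb{E}_{Z \sim P}\,k(Z, z)$ with a unit-bandwidth RBF kernel, can be estimated from samples. The sketch below is an illustrative Monte Carlo estimator under that formula; the function names are hypothetical and not taken from the paper's code.

```python
import numpy as np

def rbf_kernel(x, y, bandwidth=1.0):
    """RBF kernel k(x, y) = exp(-||x - y||^2 / (2 * bandwidth^2))."""
    diff = x - y
    return np.exp(-np.sum(diff * diff, axis=-1) / (2.0 * bandwidth ** 2))

def kernel_score(samples, z):
    """Monte Carlo estimate of s(P, z) = 1/2 E_{Z,Z'~P} k(Z, Z') - E_{Z~P} k(Z, z),
    given an (n, d) array of samples drawn from P and a query point z of shape (d,)."""
    # Pairwise term: average kernel over all ordered sample pairs (Z, Z').
    pairwise = rbf_kernel(samples[:, None, :], samples[None, :, :])
    # Cross term: average kernel between each sample and the query point z.
    cross = rbf_kernel(samples, z[None, :])
    return 0.5 * pairwise.mean() - cross.mean()
```

For example, when every sample equals z, both expectations are 1 and the score is -0.5; spreading the samples away from z raises the score.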
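The Dataset Splits row describes reserving the latest 20% of a time-ordered dataset for testing and evaluating via the KS test against unseen future sequences. A minimal numpy-only sketch of that protocol is below; the helper names are hypothetical, and the KS statistic is computed directly rather than through a library routine.

```python
import numpy as np

def time_split(series, test_frac=0.2):
    """Reserve the latest test_frac of a time-ordered array for testing."""
    cut = int(len(series) * (1.0 - test_frac))
    return series[:cut], series[cut:]

def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: the largest gap
    between the empirical CDFs of samples a and b (lower is better)."""
    a, b = np.sort(a), np.sort(b)
    all_vals = np.concatenate([a, b])
    cdf_a = np.searchsorted(a, all_vals, side="right") / len(a)
    cdf_b = np.searchsorted(b, all_vals, side="right") / len(b)
    return np.abs(cdf_a - cdf_b).max()
```

In this setup, generated values would play the role of `a` and held-out future values the role of `b`, with the statistic computed per marginal (e.g., per timestamp).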