Finite-Time Analysis of Discrete-Time Stochastic Interpolants

Authors: Yuhao Liu, Yu Chen, Rui Hu, Longbo Huang

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Finally, numerical experiments are conducted on the discrete-time sampler to corroborate our theoretical findings."
Researcher Affiliation | Academia | "IIIS, Tsinghua University, Beijing, China. Correspondence to: Longbo Huang <EMAIL>."
Pseudocode | No | The paper describes the discrete-time sampler using Equation (7), which is a mathematical formula for an update rule, but it does not present a structured pseudocode or algorithm block.
Open Source Code | No | The paper does not contain any explicit statement about releasing source code or provide a link to a code repository.
Open Datasets | Yes | "We implement the discretized sampler as defined in Equation (7), and evaluate its performance on two-dimensional datasets (primarily from Grathwohl et al., 2019) and Gaussian mixtures."
Dataset Splits | No | The paper mentions using "sampled data points" and generating data from "Gaussian mixtures" to visualize densities and estimate KL divergence, but it does not provide specific training/validation/test dataset splits for reproducibility.
Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments, such as CPU/GPU models or memory specifications.
Software Dependencies | No | The paper mentions the Adam optimizer (Kingma & Ba, 2015) and ReLU activation functions (Nair & Hinton, 2010) but does not provide specific version numbers for any software libraries, programming languages, or development environments.
Experiment Setup | Yes | "We employ I(t, x0, x1) = t·x1 + (1 − t)·x0, γ(t) = √(2t(1 − t)), and ε = 1 in our experiments. We set t0 = 0.001 and tN = 0.999 to ensure that the initial density ρ(t0) is close to ρ0 and the estimated density ρ(tN) closely approximates ρ1. To train the estimator b̂F(t, x), we leverage a simple quadratic objective (see Appendix A for details) whose optimizer is the real drift bF(t, x). We employ the Adam optimizer (Kingma & Ba, 2015) to train the network using the gradient computed on the empirical loss. The MLP architecture consists of three hidden layers, each with 256 neurons, followed by ReLU activation functions (Nair & Hinton, 2010)."
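The quoted setup fully specifies the interpolant I(t, x0, x1) = t·x1 + (1 − t)·x0 and the noise schedule γ(t) = √(2t(1 − t)), so the data side of the experiment can be sketched. The following is a minimal NumPy sketch under stated assumptions: the 4-mode Gaussian mixture parameters, the standard-Gaussian base density, and the function names are illustrative, and neither the drift network nor the sampler of Equation (7) is reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_gaussian_mixture(n, means, scale=0.1):
    """Draw n points from an equal-weight isotropic 2-D Gaussian mixture.

    The means and scale are illustrative choices; the paper does not specify them.
    """
    means = np.asarray(means, dtype=float)
    comps = rng.integers(0, len(means), size=n)
    return means[comps] + scale * rng.standard_normal((n, 2))

def interpolant(t, x0, x1):
    """Linear interpolant I(t, x0, x1) = t * x1 + (1 - t) * x0."""
    return t * x1 + (1.0 - t) * x0

def gamma(t):
    """Noise amplitude gamma(t) = sqrt(2 t (1 - t)); vanishes at t = 0 and t = 1."""
    return np.sqrt(2.0 * t * (1.0 - t))

# Endpoint samples: x0 from a standard-Gaussian base density (an assumption),
# x1 from a hypothetical 4-mode target mixture on a square.
n = 5000
x0 = rng.standard_normal((n, 2))
x1 = sample_gaussian_mixture(n, [(-2, -2), (-2, 2), (2, -2), (2, 2)])

# Noisy interpolant x(t) = I(t, x0, x1) + gamma(t) * z at an interior time,
# with t restricted to the clipped interval [t0, tN] = [0.001, 0.999]
# used in the paper.
t = 0.5
z = rng.standard_normal((n, 2))
xt = interpolant(t, x0, x1) + gamma(t) * z
```

Note that γ(t) vanishes at both endpoints, so the noisy path pins down to x0 at t = 0 and x1 at t = 1; the paper's clipping to [0.001, 0.999] keeps the discretized sampler away from these degenerate boundary times.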