Transport with Support: Data-Conditional Diffusion Bridges

Authors: Ella Tamir, Martin Trapp, Arno Solin

TMLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We assess the effectiveness of our method on synthetic and real-world data generation tasks and we show that the ISB generalises well to high-dimensional data, is computationally efficient, and provides accurate estimates of the marginals at intermediate and terminal times."
Researcher Affiliation | Academia | "Ella Tamir EMAIL Department of Computer Science, Aalto University"
Pseudocode | Yes | "We present a high-level description of the ISB steps in Alg. 1." (Algorithm 1: The Iterative Smoothing Bridge)
Open Source Code | Yes | "A reference implementation of the ISB model can be found at https://github.com/AaltoML/iterative-smoothing-bridge."
Open Datasets | Yes | "We assess the effectiveness of our method on synthetic and real-world data generation tasks... 2D toy experiments from scikit-learn... by adapting data from Ambrosini et al. (2014) and Pellegrino et al. (2015), we propose a simplified data set for geese migration in Europe (OIBMD: ornithologically implausible bird migration data; available in the supplement)... We modify the diffusion generative process of the MNIST (LeCun et al., 1998) digit 8... Lastly, we evaluated our approach on an Embryoid body scRNA-seq time course (Tong et al., 2020)."
Dataset Splits | No | The paper uses its datasets as generative targets and as observations/constraints rather than in traditional train/validation/test splits. For example, for the single-cell embryoid RNA-seq data, the authors "used the first and last time ranges as the initial and terminal constraints. All other time ranges are considered observational data." This describes how the data is used, not conventional splits for model evaluation.
Hardware Specification | Yes | "All low-dimensional (at most d = 5) experiments were run on a MacBook Pro laptop CPU, whereas the image experiments used a single NVIDIA A100 GPU and ran for 5 h 10 min."
Software Dependencies | No | The paper thanks Adrien Corenflos for sharing an implementation of differentiable resampling in PyTorch, indicating that PyTorch was used. However, specific version numbers for PyTorch or any other software dependencies are not provided in the text.
Experiment Setup | Yes | "In all experiments, the forward and backward drift functions fθ and bφ are parametrized as neural networks... The latent state SDE was simulated by Euler-Maruyama with a fixed time-step of 0.01 over 100 steps and 1000 particles if not otherwise stated... All three experiments had the same discretization (t ∈ [0, 0.99], k = 0.01), learning rate 0.001, and differentiable resampling regularization parameter ε = 0.01. The process noise g(t)² follows a linear schedule from 0.001 to 1... and each iteration of the ISB method trains the forward and backward drift networks each for 5000 iterations, with batch size 256. Other hyperparameters specific to each experiment are provided in Appendix B."
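The simulation settings quoted above (Euler-Maruyama with a fixed step of 0.01 over 100 steps, 1000 particles, and a linear schedule for the process noise g(t)²) can be sketched as follows. This is a minimal stdlib-only illustration of that setup, not the paper's implementation: the mean-reverting drift `lambda x, t: -x` is a hypothetical stand-in for the learned drift networks fθ/bφ, and the scalar state is a simplification of the paper's higher-dimensional experiments.

```python
import math
import random

def g_squared(t, t_max=0.99, start=1e-3, end=1.0):
    """Linear schedule for the process noise g(t)^2, rising from 0.001 to 1
    over the time interval stated in the paper."""
    return start + (end - start) * (t / t_max)

def euler_maruyama(drift, x0, n_steps=100, dt=0.01, n_particles=1000, seed=0):
    """Simulate dX_t = drift(X_t, t) dt + g(t) dW_t for a scalar state.

    Matches the quoted setup: fixed time-step 0.01, 100 steps, 1000 particles.
    `drift` is a placeholder for a learned drift network.
    """
    rng = random.Random(seed)
    xs = [x0] * n_particles
    marginal_means = []
    for k in range(n_steps):
        t = k * dt
        g = math.sqrt(g_squared(t))
        # One Euler-Maruyama step per particle: drift term plus scaled Gaussian noise.
        xs = [x + drift(x, t) * dt + g * math.sqrt(dt) * rng.gauss(0.0, 1.0)
              for x in xs]
        marginal_means.append(sum(xs) / n_particles)
    return xs, marginal_means

# Example: a simple mean-reverting drift standing in for f_theta.
particles, means = euler_maruyama(lambda x, t: -x, x0=1.0)
```

The particle population here is what a smoothing or resampling step would operate on; the paper's differentiable resampling (regularization parameter ε = 0.01) is omitted from this sketch.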