Improving and generalizing flow-based generative models with minibatch optimal transport
Authors: Alexander Tong, Kilian Fatras, Nikolay Malkin, Guillaume Huguet, Yanlei Zhang, Jarrid Rector-Brooks, Guy Wolf, Yoshua Bengio
TMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate CFM and OT-CFM in experiments on single-cell dynamics, image generation, unsupervised image translation, and energy-based models. We show that the OT-CFM objective leads to more efficient training and decreases inference time while finding better approximate solutions to the dynamic OT and Schrödinger bridge problems. |
| Researcher Affiliation | Academia | Alexander Tong EMAIL Mila Québec AI Institute, Université de Montréal Kilian Fatras EMAIL Mila Québec AI Institute, McGill University Nikolay Malkin EMAIL Mila Québec AI Institute, Université de Montréal |
| Pseudocode | Yes | Algorithm 1 Conditional Flow Matching Algorithm 2 Simplified Conditional Flow Matching (I-CFM) Algorithm 3 Minibatch OT Conditional Flow Matching (OT-CFM) Algorithm 4 Minibatch Schrödinger Bridge Conditional Flow Matching (SB-CFM) |
| Open Source Code | Yes | The Python code is available at https://github.com/atong01/conditional-flow-matching. |
| Open Datasets | Yes | We perform an experiment on unconditional CIFAR-10 generation from a Gaussian source... We show how CFM can be used to learn a mapping between two unpaired datasets in high-dimensional space using the CelebA dataset (Liu et al., 2015; Sun et al., 2014)... We repurpose the CITE-seq and Multiome datasets from a recent NeurIPS competition for this task (Burkhardt et al., 2022). We also include the Embryoid body data from Moon et al. (2019); Tong et al. (2020). The 10-dimensional funnel dataset from Hoffman & Gelman (2011). |
| Dataset Splits | Yes | In this task we use leave-one-out validation over the timepoints. Using data at times [0, t−1] and [t+1, T], we try to interpolate the distribution at time t, following the setup of Schiebinger et al. (2019); Tong et al. (2020); Huguet et al. (2022a). |
| Hardware Specification | Yes | All experiments were performed on a shared heterogeneous high-performance-computing cluster. This cluster is primarily composed of GPU nodes with RTX8000, A100, and V100 Nvidia GPUs... a single A100 GPU |
| Software Dependencies | No | For all experiments we use the same architecture implemented in PyTorch (Paszke et al., 2019)... We use the AdamW (Loshchilov & Hutter, 2019) optimizer... For OT-CFM and SB-CFM we use exact linear programming EMD and Sinkhorn algorithms from the Python Optimal Transport package (Flamary et al., 2021)... For sampling, we use Euler integration using the torchdyn package and dopri5 from the torchdiffeq package. |
| Experiment Setup | Yes | For all 2D and single-cell experiments we train for 1000 epochs and implement early stopping on the validation loss, which checks the loss on a validation set every 10 epochs and stops training if there is no improvement for 30 epochs... We use the AdamW (Loshchilov & Hutter, 2019) optimizer with weight decay 10⁻⁵ with batch size 512 by default in 2D experiments and 128 in the single-cell datasets... The main differences with Lipman et al. (2023) are that we use a constant learning rate, set to 2×10⁻⁴... we clip the gradient norm to 1 and rely on exponential moving average with a decay of 0.9999. Furthermore, our batch size was 128 instead of 256 |
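The core step behind Algorithm 3 (OT-CFM) quoted above is: couple the source and target minibatches with an exact OT plan, then build the standard CFM interpolant and regression target from the coupled pairs. A minimal sketch follows; the paper uses the exact EMD solver from the Python Optimal Transport package, but for equal-size minibatches with uniform weights exact EMD reduces to a linear assignment problem, so `scipy.optimize.linear_sum_assignment` is used here as a stand-in for illustration. Names like `ot_cfm_pairs` are hypothetical, not from the paper's codebase.

```python
# Hedged sketch of OT-CFM pair construction (after Algorithm 3 in the paper),
# assuming equal-size uniform minibatches so exact EMD = linear assignment.
import numpy as np
from scipy.optimize import linear_sum_assignment

rng = np.random.default_rng(0)

def ot_cfm_pairs(x0, x1):
    """Couple source/target minibatches by exact OT under squared Euclidean cost."""
    # Cost matrix C[i, j] = ||x0_i - x1_j||^2
    cost = ((x0[:, None, :] - x1[None, :, :]) ** 2).sum(-1)
    rows, cols = linear_sum_assignment(cost)  # exact OT plan for uniform weights
    return x0[rows], x1[cols]

def cfm_interpolant(x0, x1, t):
    """Interpolant x_t and target velocity u_t = x1 - x0 for the CFM regression loss."""
    xt = (1.0 - t)[:, None] * x0 + t[:, None] * x1
    ut = x1 - x0
    return xt, ut

batch = 64
x0 = rng.normal(size=(batch, 2))        # source minibatch (e.g. Gaussian prior)
x1 = rng.normal(size=(batch, 2)) + 5.0  # target minibatch (data samples)
x0p, x1p = ot_cfm_pairs(x0, x1)
t = rng.uniform(size=batch)
xt, ut = cfm_interpolant(x0p, x1p, t)
# A vector field v_theta(t, x_t) would then be trained with MSE against u_t;
# I-CFM (Algorithm 2) is the same loop with the OT coupling step removed.
```

The design point the paper makes is that this per-minibatch coupling straightens the regression targets, which is what yields the faster training and lower inference cost reported in the Research Type row above.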