Gaussian Mixture Flow Matching Models

Authors: Hansheng Chen, Kai Zhang, Hao Tan, Zexiang Xu, Fujun Luan, Leonidas Guibas, Gordon Wetzstein, Sai Bi

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments demonstrate that GMFlow consistently outperforms flow matching baselines in generation quality, achieving a Precision of 0.942 with only 6 sampling steps on ImageNet 256×256. For evaluation, we compare GMFlow against vanilla flow matching baselines on both a 2D toy dataset and ImageNet (Deng et al., 2009). Extensive experiments reveal that GMFlow consistently outperforms baselines equipped with advanced solvers.
Researcher Affiliation | Collaboration | 1 Stanford University, CA 94305, USA; 2 Adobe Research, CA 95110, USA; 3 Hillbot. Correspondence to: Hansheng Chen <EMAIL>.
Pseudocode | Yes | Algorithms 1 and 2 present the outlines of the training and sampling schemes, respectively.
Open Source Code | Yes | https://github.com/Lakonik/GMFlow
Open Datasets | Yes | For evaluation, we compare GMFlow against vanilla flow matching baselines on both a 2D toy dataset and ImageNet (Deng et al., 2009). [...] Table 5 presents a quantitative comparison among GMFlow (K = 2), GMS (Guo et al., 2023), SN-DDPM (Bao et al., 2022a), and DDPM (Ho et al., 2020) for CIFAR-10 (Krizhevsky et al., 2009) unconditional image generation using SDE sampling.
Dataset Splits | Yes | For image generation evaluation, we benchmark GMFlow against vanilla flow baselines on class-conditioned ImageNet 256×256. [...] The time-averaged NLL values are computed on 50K samples from the training dataset using the following equation.
Hardware Specification | Yes | We train both the baseline and GMFlow-DiT on ImageNet 256×256 with a batch size of 4096 images across 16 A100 GPUs, using a total training schedule of 200K iterations.
Software Dependencies | No | The paper mentions the "8-bit AdamW (Dettmers et al., 2022; Loshchilov & Hutter, 2019) optimizer" and "Diffusers implementations (von Platen et al., 2022)", but it provides no version numbers for Python, PyTorch, CUDA, or the Diffusers library itself; it only cites papers for techniques and general implementations without pinning the software environment.
Experiment Setup | Yes | We train both the baseline and GMFlow-DiT on ImageNet 256×256 with a batch size of 4096 images across 16 A100 GPUs, using a total training schedule of 200K iterations. We adopt the 8-bit AdamW (Dettmers et al., 2022; Loshchilov & Hutter, 2019) optimizer with a fixed learning rate of 0.0002. Following Stable Diffusion 3 (Esser et al., 2024), both models sample t from a logit-normal distribution during training (Algorithm 1), which accelerates convergence.
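The setup row quotes two concrete technical ingredients from the paper: logit-normal timestep sampling during training, and a model that outputs a Gaussian mixture rather than a point estimate. A minimal NumPy sketch of one such training iteration follows; the linear interpolation path, the velocity target, the mixture negative log-likelihood, and the `predict(x_t, t)` interface returning `(means, logits, log_sigma)` are all illustrative assumptions for this sketch, not the paper's actual implementation.

```python
import math
import numpy as np

def sample_logit_normal_t(batch_size, mean=0.0, std=1.0, rng=None):
    # SD3-style timestep sampling: logit(t) ~ N(mean, std), so t concentrates
    # at intermediate noise levels rather than at the endpoints 0 and 1.
    rng = np.random.default_rng() if rng is None else rng
    z = rng.standard_normal(batch_size) * std + mean
    return 1.0 / (1.0 + np.exp(-z))

def gm_velocity_nll(u, means, logits, log_sigma):
    # Negative log-likelihood of the velocity target u under a K-component
    # isotropic Gaussian mixture with shared scalar std exp(log_sigma).
    # Shapes: u (B, D), means (B, K, D), logits (B, K); log_sigma is a float.
    d = u.shape[-1]
    sq = ((u[:, None, :] - means) ** 2).sum(-1)                 # (B, K)
    log_comp = (-0.5 * sq * math.exp(-2.0 * log_sigma)
                - d * log_sigma - 0.5 * d * math.log(2.0 * math.pi))
    m = logits.max(-1, keepdims=True)                           # log-softmax weights
    log_w = logits - m - np.log(np.exp(logits - m).sum(-1, keepdims=True))
    s = log_w + log_comp                                        # joint log-density
    sm = s.max(-1)
    log_mix = sm + np.log(np.exp(s - sm[:, None]).sum(-1))      # logsumexp over K
    return -log_mix.mean()

def training_step(predict, x0, rng=None):
    # One flow-matching iteration on a data batch x0: interpolate toward noise
    # at a logit-normal t, then score the model's mixture prediction of the
    # straight-line velocity target.
    rng = np.random.default_rng() if rng is None else rng
    t = sample_logit_normal_t(x0.shape[0], rng=rng)[:, None]    # (B, 1)
    eps = rng.standard_normal(x0.shape)
    x_t = (1.0 - t) * x0 + t * eps                              # linear path
    u = eps - x0                                                # velocity target
    means, logits, log_sigma = predict(x_t, t)                  # assumed interface
    return gm_velocity_nll(u, means, logits, log_sigma)
```

With K = 1 the mixture NLL reduces to the usual squared-error flow matching loss up to a constant, which is one way to see how this sketch generalizes the vanilla baseline the report compares against.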