Variational Rectified Flow Matching

Authors: Pengsheng Guo, Alex Schwing

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We show on synthetic data, MNIST, CIFAR-10, and ImageNet that variational rectified flow matching leads to compelling results. We evaluate the efficacy of variational rectified flow matching and compare to the classic rectified flow (Lipman et al., 2023; Liu et al., 2023; Albergo & Vanden-Eijnden, 2023) across multiple datasets and model architectures. Our experiments show that variational rectified flow matching is able to capture the multi-modal velocity in the data-time space, leading to compelling evaluation results.
Researcher Affiliation | Industry | Apple. Correspondence to: Pengsheng Guo <pengsheng EMAIL>.
Pseudocode | Yes | Algorithm 1: Variational Rectified Flow Matching Training; Algorithm 2: Variational Rectified Flow Matching Inference.
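The report only names the paper's two algorithms. As a rough illustration of what they describe, here is a minimal NumPy sketch, assuming the setup the paper's title implies: a velocity field conditioned on a latent z drawn from a learned posterior q(z | x0, x1, t), trained with a rectified-flow matching loss plus a KL term, and sampled by integrating the velocity field. All function names, the linear encoders, and the dimensions are hypothetical placeholders, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def posterior(x0, x1, t, W):
    # Hypothetical linear posterior encoder q(z | x0, x1, t) -> (mu, logvar).
    h = np.concatenate([x0, x1, t], axis=-1) @ W
    mu, logvar = np.split(h, 2, axis=-1)
    return mu, logvar

def velocity(xt, t, z, V):
    # Hypothetical linear velocity field v(x_t, t, z).
    return np.concatenate([xt, t, z], axis=-1) @ V

def training_loss(x1, W, V, kl_weight=1e-3):
    # One step of the training objective (cf. Algorithm 1, as a sketch):
    # flow-matching loss on the latent-conditioned velocity, plus a KL term.
    n, d = x1.shape
    x0 = rng.standard_normal((n, d))          # noise endpoint
    t = rng.random((n, 1))                    # uniform time in [0, 1]
    xt = (1 - t) * x0 + t * x1                # linear interpolant
    target = x1 - x0                          # rectified-flow velocity target
    mu, logvar = posterior(x0, x1, t, W)
    z = mu + rng.standard_normal(mu.shape) * np.exp(0.5 * logvar)  # reparameterize
    pred = velocity(xt, t, z, V)
    fm = np.mean((pred - target) ** 2)
    kl = -0.5 * np.mean(1 + logvar - mu**2 - np.exp(logvar))
    return fm + kl_weight * kl

def sample(x0, V, z, steps=250):
    # Inference sketch (cf. Algorithm 2): plain Euler ODE integration of the
    # velocity field from t=0 to t=1. (The paper's ImageNet results use an
    # Euler-Maruyama SDE sampler; this deterministic variant is simpler.)
    x, dt = x0, 1.0 / steps
    for i in range(steps):
        t = np.full((x.shape[0], 1), i * dt)
        x = x + dt * velocity(x, t, z, V)
    return x
```

With data dimension d and latent dimension z_dim, the weight shapes are W: (2d+1, 2*z_dim) and V: (d+1+z_dim, d); in practice both maps would be deep networks trained end to end.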
Open Source Code | No | The paper mentions publicly available codebases for baseline models (e.g., consistency flow matching) in the appendices, but does not provide a statement or link for the authors' own implementation of variational rectified flow matching.
Open Datasets | Yes | We show on synthetic data, MNIST, CIFAR-10, and ImageNet that variational rectified flow matching leads to compelling results. On MNIST, it enables controllable image generation with improved quality. On CIFAR-10, our approach outperforms classic rectified flow matching across different integration steps. Lastly, on ImageNet, our method consistently improves the FID score of SiT-XL (Ma et al., 2024).
Dataset Splits | No | The paper uses well-known datasets (MNIST, CIFAR-10, ImageNet) that have standard splits, but it does not explicitly state the training/validation/test split percentages, sample counts, or reference citations for the splits used in its experiments, in either the main text or the appendices.
Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or memory specifications used for running the experiments.
Software Dependencies | No | The paper mentions optimizers ('AdamW' and 'Adam') but does not specify software versions for any libraries, frameworks (e.g., PyTorch, TensorFlow), or programming languages used in the implementation.
Experiment Setup | Yes | For synthetic 1D experiments, ... the KL loss weight is 1.0. ... The AdamW optimizer is used with the default weight decay and a learning rate of 1×10⁻³, over a total of 20,000 training iterations. For MNIST: the KL loss weight is set to 1×10⁻³. ... AdamW optimizer with a learning rate of 1×10⁻³ and a batch size of 256. ... Trained jointly for 100,000 iterations. For CIFAR-10: Adam optimizer with a learning rate of 2×10⁻⁴ and a batch size of 128; trained jointly for 600,000 iterations. The KL loss weighting is presented alongside the results in Table 3. For ImageNet: AdamW optimizer with a learning rate of 1×10⁻⁴ and a global batch size of 256; trained jointly for 800,000 iterations. The KL loss weight is set to 2×10⁻³. ... Euler-Maruyama sampler with the SDE solver and 250 integration steps.
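For quick reference, the hyperparameters quoted above can be transcribed into a single structure. The values below are taken directly from the reported setup; the dictionary keys and the `CONFIGS` name are my own choice, not from the paper.

```python
# Transcription of the reported training hyperparameters per dataset.
# Values come from the paper's experiment-setup text; field names are
# illustrative, not the authors' own configuration schema.
CONFIGS = {
    "synthetic_1d": dict(optimizer="AdamW", lr=1e-3, kl_weight=1.0,
                         iterations=20_000),
    "mnist":        dict(optimizer="AdamW", lr=1e-3, batch_size=256,
                         kl_weight=1e-3, iterations=100_000),
    "cifar10":      dict(optimizer="Adam", lr=2e-4, batch_size=128,
                         iterations=600_000),  # KL weight varied; see Table 3
    "imagenet":     dict(optimizer="AdamW", lr=1e-4, batch_size=256,
                         kl_weight=2e-3, iterations=800_000,
                         sampler="Euler-Maruyama (SDE)", sampler_steps=250),
}
```

Note the asymmetry worth preserving when reproducing: CIFAR-10 uses plain Adam while the other settings use AdamW, and only ImageNet reports an SDE sampler with 250 integration steps.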