On the Guidance of Flow Matching
Authors: Ruiqi Feng, Chenglei Yu, Wenhao Deng, Peiyan Hu, Tailin Wu
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on synthetic datasets, image inverse problems, and offline reinforcement learning demonstrate the effectiveness of our proposed guidance methods and verify the correctness of our flow matching guidance framework. ... Empirical comparisons between guidance methods are conducted in different tasks, providing insights into choosing appropriate guidance methods for different generative modeling tasks. |
| Researcher Affiliation | Academia | 1 Department of Artificial Intelligence, Westlake University, Hangzhou, China 2 Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China. Correspondence to: Tailin Wu <EMAIL>. |
| Pseudocode | Yes | The pseudocode for computing g_t^MC(x_t) can be found in Algorithm 1, and a simplified version of g^MC under the assumption of independent coupling is provided in Appendix A.8. ... Algorithm 1 Monte Carlo estimation of the guidance g_t(x_t) ... Algorithm 2 Monte Carlo estimation of the guidance g_t(x_t) |
| Open Source Code | Yes | Code to reproduce the experiments can be found at https://github.com/AI4Science-WestlakeU/flow_guidance. |
| Open Datasets | Yes | We report experiment results on the Locomotion tasks in the D4RL dataset (Fu et al., 2020)... We conduct experiments on the image inverse problems on the CelebA-HQ (256×256) dataset... |
| Dataset Splits | Yes | For the CelebA dataset, we employed a train-validation-test split of 8:1:1. |
| Hardware Specification | Yes | The run time was roughly 3 days on two H800 GPUs. |
| Software Dependencies | No | The paper describes model architectures (MLP, Transformer, U-Net) and mentions frameworks in the context of prior work (e.g., ... |
| Experiment Setup | Yes | The model backbone is an MLP of 4 layers with a hidden dimension of 256. The models are trained for 1e5 steps. ... a batch size of 32, a learning rate of 2e-4, and the cosine annealing learning rate scheduler. ... The value discount factor is set to 0.99 for all 3 datasets. We use a planning horizon of 20 steps and a planning stride of 1. ... For deblurring, we apply a 61×61 Gaussian kernel with a standard deviation of σb = 1.0. For super-resolution, we perform 4× downsampling... In the case of box-inpainting, we use a centered 40×40 mask. Furthermore, for all three tasks, we add Gaussian noise after the degradation operation with a standard deviation of σ = 0.05 to the images. |
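The pseudocode row refers to a Monte Carlo estimate of the guidance g_t(x_t). The sketch below illustrates the general idea of such an estimator: draw candidate terminal samples, reweight them by a reward, and take the reward-tilted direction. The sampler, reward, and weighting scheme here are illustrative assumptions, not a reproduction of the paper's Algorithm 1.

```python
import numpy as np

def mc_guidance(x_t, t, sample_x1, reward, n_samples=64):
    """Illustrative Monte Carlo estimate of a guidance vector g_t(x_t).

    Draws candidate terminal samples x1 ~ q(x1 | x_t), reweights them by
    exp(reward(x1)), and returns the reward-tilted mean displacement
    relative to the unweighted mean. This is a generic importance-weighted
    scheme, not the paper's exact Algorithm 1.
    """
    x1 = np.stack([sample_x1(x_t, t) for _ in range(n_samples)])  # (K, d)
    logw = np.array([reward(x) for x in x1])
    logw -= logw.max()                        # stabilize the softmax
    w = np.exp(logw)
    w /= w.sum()                              # normalized importance weights
    tilted_mean = (w[:, None] * x1).sum(0)    # E_w[x1], reward-tilted
    plain_mean = x1.mean(0)                   # E[x1] under the plain sampler
    return (tilted_mean - plain_mean) / max(1.0 - t, 1e-6)
```

With a reward that peaks at a target point, the returned vector points from the unguided prediction toward that target, which is the qualitative behavior a guidance term should have.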
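The experiment-setup row describes standard flow matching training (MLP backbone, 1e5 steps, lr 2e-4). For context, the regression target in conditional flow matching with the linear path and independent coupling is the velocity v = x1 − x0 evaluated at the interpolant x_t; the helper below computes that pair. This is the textbook construction, stated here as background rather than as the paper's specific variant.

```python
import numpy as np

def cfm_training_pair(x0, x1, t):
    """Conditional flow matching target for the linear (OT) path:
    the network is regressed onto v = x1 - x0 at x_t = (1-t)*x0 + t*x1."""
    x_t = (1.0 - t) * x0 + t * x1
    v_target = x1 - x0
    return x_t, v_target
```

In a training loop, (x_t, v_target) pairs like these would be fed to the 4-layer MLP with an MSE loss, using the quoted batch size of 32, learning rate of 2e-4, and cosine annealing schedule.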
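The dataset-splits row quotes an 8:1:1 train/validation/test split for CelebA. A minimal sketch of such a split is below, assuming a simple random (non-stratified) split, which the paper does not specify.

```python
import numpy as np

def split_811(n, seed=0):
    """Shuffle n indices and split them 8:1:1 into train/val/test.
    Assumes a plain random split; stratification is not described."""
    idx = np.random.default_rng(seed).permutation(n)
    n_train, n_val = int(0.8 * n), int(0.1 * n)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]
```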