Simple ReFlow: Improved Techniques for Fast Flow Models
Authors: Beomsu Kim, Yu-Guan Hsieh, Michal Klein, Marco Cuturi, Jong Chul Ye, Bahjat Kawar, James Thornton
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To mitigate sample deterioration, we examine the design space of ReFlow and highlight potential pitfalls in prior heuristic practices. We then propose seven improvements for training dynamics, learning and inference, which are verified with thorough ablation studies on CIFAR10 32×32, AFHQv2 64×64, and FFHQ 64×64. Combining all our techniques, we achieve state-of-the-art FID scores (without / with guidance, resp.) for fast generation via neural ODEs: 2.23 / 1.98 on CIFAR10, 2.30 / 1.91 on AFHQv2, 2.84 / 2.67 on FFHQ, and 3.49 / 1.74 on ImageNet-64, all with merely 9 neural function evaluations. |
| Researcher Affiliation | Collaboration | Beomsu Kim (Apple and KAIST); Yu-Guan Hsieh (Apple); Michal Klein (Apple); Marco Cuturi (Apple); Jong Chul Ye (KAIST); Bahjat Kawar (Apple); James Thornton (Apple) |
| Pseudocode | Yes | Algorithm 1: Coupling projection given λ; Algorithm 2: Coupling projection |
| Open Source Code | No | The paper does not contain an explicit statement about releasing their source code, nor does it provide a direct link to a code repository for the methodology described. |
| Open Datasets | Yes | For each proposed improvement, we verify its effect on sample quality via extensive ablations on three datasets: CIFAR10 32×32 (Krizhevsky, 2009), FFHQ 64×64 (Karras et al., 2019), and AFHQv2 64×64 (Choi et al., 2020). |
| Dataset Splits | Yes | We sample 1M pairs from Q^1_{01} by solving Eq. (6) from t = 1 to 0 with EDM models and use them for training. We measure the performance of an optimized model by computing the FID (Heusel et al., 2017) between 50k generated images and all available dataset images. ... Num. Backward Pairs: 1M / 1M / 1M / 8M; Num. Forward Pairs: 50k / 13.5k / 70k / 4M |
| Hardware Specification | No | The paper does not mention specific hardware used for running the experiments (e.g., specific GPU models, CPU models, or cloud configurations). |
| Software Dependencies | No | The paper mentions using the 'OTT-JAX library (Cuturi et al., 2022)' but does not provide specific version numbers for this or any other software dependencies. |
| Experiment Setup | Yes | Specific optimization hyper-parameters are reported in Tab. 9. ... Table 9: Training hyper-parameters. Iterations: 200k / 200k / 200k / 500k; Minibatch Size: 512 / 256 / 256 / 1024; Adam LR: 2e-4 for all; Label dropout: 0.1; EMA: 0.9999 for all; Num. Backward Pairs: 1M / 1M / 1M / 8M; Num. Forward Pairs: 50k / 13.5k / 70k / 4M. ... D.1.1 BEST SETTINGS: CIFAR10. High-pass filter λ = 10, dropout probability 0.09, forward pairs with mixing ratio ρ = 0.4, sigmoid discretization with κ = 20, DPM-solver r = 0.4, AutoGuidance scale w = 0.6. |
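The best-settings row above mentions a "sigmoid discretization with κ = 20" for the inference timestep grid. The report excerpt does not give the exact parameterization used in the paper, so the following is only an illustrative sketch of one common way to build such a grid: map a uniform grid through a logistic sigmoid with sharpness κ and rescale to span [0, 1], which clusters steps near both ends of the integration interval. The function name and the choice of a symmetric grid are assumptions, not the paper's definition.

```python
import numpy as np

def sigmoid_discretization(n_steps: int, kappa: float = 20.0) -> np.ndarray:
    """Hypothetical sigmoid timestep grid: n_steps ODE steps over [0, 1].

    A uniform grid on [-1/2, 1/2] is pushed through a logistic sigmoid
    with sharpness kappa, then rescaled so the grid exactly spans [0, 1].
    Larger kappa concentrates more points near t = 0 and t = 1.
    """
    u = np.linspace(-0.5, 0.5, n_steps + 1)
    s = 1.0 / (1.0 + np.exp(-kappa * u))
    return (s - s[0]) / (s[-1] - s[0])

# 9 neural function evaluations correspond to a 10-point timestep grid.
ts = sigmoid_discretization(9, kappa=20.0)
print(ts)
```

With κ = 20 the resulting grid is strongly non-uniform: roughly half of the interior points fall in the outer tenths of [0, 1], which is the qualitative effect a sigmoid schedule is meant to achieve.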