FlowTurbo: Towards Real-time Flow-Based Image Generation with Velocity Refiner

Authors: Wenliang Zhao, Minglei Shi, Xumin Yu, Jie Zhou, Jiwen Lu

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We perform extensive experiments to evaluate our method. By applying FlowTurbo to different flow-based models, we obtain an acceleration ratio of 53.1%-58.3% on class-conditional generation and 29.8%-38.5% on text-to-image generation.
Researcher Affiliation | Academia | Wenliang Zhao, Minglei Shi, Xumin Yu, Jie Zhou, Jiwen Lu: Department of Automation, Tsinghua University.
Pseudocode | Yes | Algorithm 1 (Heun's Method Sampler) and Algorithm 2 (Pseudo Corrector Sampler) in Appendix B. (A minimal Heun-step sketch is given after this table.)
Open Source Code | Yes | Code is available at https://github.com/shiml20/FlowTurbo.
Open Datasets | Yes | For class-conditional image generation, we adopt a transformer-style flow-based model SiT-XL [24] pre-trained on ImageNet 256×256. We use ImageNet-1K [6] to train our velocity model. We use a subset of LAION [34] containing only 50K images to train our velocity model.
Dataset Splits | No | The paper mentions using the 'MS COCO 2017 [16] validation set' for FID calculation but does not explicitly state the train/validation splits used for training its own models or components.
Hardware Specification | Yes | In both tasks, we use a single NVIDIA A800 GPU to train the velocity refiner and find it converges within 6 hours. We use a batch size of 8 on a single A800 GPU to measure the latency of each method. (A rough latency-timing sketch follows this table.)
Software Dependencies | No | Our code is implemented in PyTorch [footnote 6]. (The '6' is a footnote marker, not a version number; no specific PyTorch version or other library versions are mentioned.)
Experiment Setup | Yes | Following common practice [24, 30], we adopt a classifier-free guidance scale (CFG) of 1.5. During training, we randomly sample t ∈ (0, 0.12] and compute the training objectives in (13). We use the AdamW [21] optimizer for all models. We use a constant learning rate of 5×10^-5 and a batch size of 18 on a single A800 GPU. We use the AdamW [21] optimizer with a learning rate of 2e-5 and weight decay of 0.0. We adopt a batch size of 16 and set the warm-up steps to 100. We also use gradient clipping of 0.01 to stabilize training. (See the training-loop sketch after this table.)
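
For reference, below is a minimal sketch of a Heun's-method (second-order) sampler for a flow-based model, corresponding in spirit to Algorithm 1 cited in the Pseudocode row. The `velocity_model(x, t)` interface, step count, and uniform time grid are assumptions; the authors' actual sampler (and the pseudo-corrector variant of Algorithm 2) may differ, for example in how classifier-free guidance and the velocity refiner are applied.

```python
import torch

@torch.no_grad()
def heun_sample(velocity_model, x, num_steps=8, t_start=0.0, t_end=1.0):
    # Second-order Heun (predictor-corrector) integration of dx/dt = v(x, t).
    ts = torch.linspace(t_start, t_end, num_steps + 1, device=x.device)
    for i in range(num_steps):
        t, t_next = ts[i], ts[i + 1]
        dt = t_next - t
        v = velocity_model(x, t)              # velocity at the current state
        x_euler = x + dt * v                  # Euler predictor step
        v_next = velocity_model(x_euler, t_next)
        x = x + 0.5 * dt * (v + v_next)       # Heun corrector: trapezoidal average of velocities
    return x
```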
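
The latency figures in the paper are reported with a batch size of 8 on a single A800 GPU, but the exact timing protocol is not spelled out. A rough, hypothetical way to measure per-batch sampling latency in PyTorch (with warm-up and CUDA synchronization) could look like this:

```python
import time
import torch

def measure_latency(sampler_fn, batch_size=8, shape=(4, 32, 32), warmup=3, runs=10, device="cuda"):
    # Average per-batch wall-clock time of `sampler_fn` on random latent inputs.
    x = torch.randn(batch_size, *shape, device=device)
    for _ in range(warmup):                   # warm-up iterations to exclude one-off setup costs
        sampler_fn(x)
    torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(runs):
        sampler_fn(x)
    torch.cuda.synchronize()                  # wait for all GPU work before stopping the clock
    return (time.perf_counter() - start) / runs
```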
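
The Experiment Setup row lists the optimizer hyper-parameters for training the velocity refiner. A hypothetical training-loop skeleton reflecting the reported text-to-image settings (AdamW, learning rate 2e-5, weight decay 0.0, 100 warm-up steps, gradient clipping of 0.01) is sketched below; the model, data loader, and loss function (the objective in Eq. (13)) are placeholders, not the authors' code.

```python
import torch
from torch.optim import AdamW
from torch.optim.lr_scheduler import LambdaLR

def train_refiner(refiner, loader, flow_loss, total_steps=10_000, device="cuda"):
    # Hypothetical skeleton; `refiner`, `loader`, and `flow_loss` are placeholders.
    refiner.to(device).train()
    opt = AdamW(refiner.parameters(), lr=2e-5, weight_decay=0.0)
    warmup = 100
    sched = LambdaLR(opt, lambda s: min(1.0, (s + 1) / warmup))  # linear warm-up, then constant lr
    step = 0
    while step < total_steps:
        for batch in loader:                  # loader assumed to yield batches (e.g. size 16)
            loss = flow_loss(refiner, batch)  # training objective, Eq. (13) in the paper
            opt.zero_grad(set_to_none=True)
            loss.backward()
            torch.nn.utils.clip_grad_norm_(refiner.parameters(), 0.01)  # gradient clipping of 0.01
            opt.step()
            sched.step()
            step += 1
            if step >= total_steps:
                break
    return refiner
```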