Looking Backward: Streaming Video-to-Video Translation with Feature Banks

Authors: Feng Liang, Akio Kodaira, Chenfeng Xu, Masayoshi Tomizuka, Kurt Keutzer, Diana Marculescu

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate our method with quantitative metrics, such as CLIP score (Radford et al., 2021) and warp error (Lai et al., 2018), and a user study. Our findings indicate that users significantly favor our StreamV2V over StreamDiffusion (Kodaira et al., 2023) (with over 70% win rates) and CoDeF (Ouyang et al., 2023) (with over 80% win rates).
Researcher Affiliation | Academia | 1 UT Austin, 2 UC Berkeley
Pseudocode | Yes | A.5.2 PSEUDO CODE OF DYNAMIC MERGING: import torch; import torch.nn.functional as F; def dynamic_merge(current_frame, feature_bank): ...
Open Source Code | Yes | Demo, code, and models are available on the project page: https://jeff-liangf.github.io/projects/streamv2v
Open Datasets | Yes | Following TokenFlow (Geyer et al., 2023) and FlowVid (Liang et al., 2023), we build our user study by selecting 19 object-centric videos from the DAVIS trainval 2017 dataset (Pont-Tuset et al., 2017), covering diverse subjects such as humans and animals.
Dataset Splits | No | Following TokenFlow (Geyer et al., 2023) and FlowVid (Liang et al., 2023), we build our user study by selecting 19 object-centric videos from the DAVIS trainval 2017 dataset (Pont-Tuset et al., 2017)...
Hardware Specification | Yes | StreamV2V can run 20 FPS on one A100 GPU, being 15×, 46×, 108×, and 158× faster than FlowVid, CoDeF, Rerender, and TokenFlow, respectively.
Software Dependencies | No | We built our method on StreamDiffusion (Kodaira et al., 2023) with Latent Consistency Model (Luo et al., 2023b). By default, we use a 4-step LCM without the classifier-free guidance (Ho & Salimans, 2022). We continue to use xFormers (Lefaudeux et al., 2022) for fair comparison with existing methods.
Experiment Setup | Yes | By default, we use a 4-step LCM without the classifier-free guidance (Ho & Salimans, 2022). We update the feature bank every 4 frames. The underlying image-to-image method is SDEdit (Meng et al., 2021), with an initial noise strength of 0.4.
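The pseudocode row above quotes only the header of the paper's `dynamic_merge` function. As context for what similarity-based feature merging can look like, here is a hypothetical, dependency-free sketch: it merges each incoming feature into its most cosine-similar bank entry, or appends it otherwise. The `threshold` value and the uniform-averaging rule are illustrative assumptions, not the paper's algorithm.

```python
import math

def cosine_sim(a, b):
    """Cosine similarity between two feature vectors (lists of floats)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def dynamic_merge(current_features, feature_bank, threshold=0.9):
    """Merge each current-frame feature into its most similar bank entry;
    if nothing in the bank is similar enough, append it as a new entry.
    The threshold and averaging scheme are illustrative only."""
    merged = [list(f) for f in feature_bank]
    for feat in current_features:
        sims = [cosine_sim(feat, entry) for entry in merged]
        best = max(range(len(sims)), key=sims.__getitem__) if sims else -1
        if sims and sims[best] >= threshold:
            # Fold the new feature into the matched entry (uniform weights).
            merged[best] = [(x + y) / 2 for x, y in zip(merged[best], feat)]
        else:
            merged.append(list(feat))
    return merged
```

For example, a feature nearly parallel to an existing bank entry is averaged into it, while an orthogonal feature becomes a new entry.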
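The experiment-setup row states that the feature bank is refreshed every 4 frames. A minimal pure-Python sketch of that update cadence, assuming a fixed-capacity bank (the `BANK_SIZE` capacity and all function names here are illustrative, not from the paper):

```python
from collections import deque

BANK_UPDATE_INTERVAL = 4  # per the paper: the bank is updated every 4 frames
BANK_SIZE = 3             # illustrative capacity; not specified in this excerpt

def stream_frames(frames, bank_update_interval=BANK_UPDATE_INTERVAL):
    """Yield (frame, bank_snapshot) pairs for a streaming pipeline,
    caching a frame's features into the bank every N-th frame."""
    bank = deque(maxlen=BANK_SIZE)  # oldest entries are evicted automatically
    for i, frame in enumerate(frames):
        if i % bank_update_interval == 0:
            bank.append(frame)  # stand-in for caching this frame's features
        yield frame, list(bank)
```

On a 12-frame stream, frames 0, 4, and 8 trigger bank updates, so later frames can look backward at features cached from those earlier frames.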