I2VControl-Camera: Precise Video Camera Control with Adjustable Motion Strength
Authors: Wanquan Feng, Jiawei Liu, Pengqi Tu, Tianhao Qi, Mingzhen Sun, Tianxiang Ma, Songtao Zhao, SiYu Zhou, Qian HE
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on static and dynamic scenes show that our framework outperforms previous methods both quantitatively and qualitatively. |
| Researcher Affiliation | Collaboration | (1) ByteDance China; (2) University of Science and Technology of China (USTC); (3) Institute of Automation, Chinese Academy of Sciences (CASIA) |
| Pseudocode | Yes | Algorithm 1: Static and dynamic region extraction based on trajectory analysis |
| Open Source Code | Yes | Project page: https://wanquanf.github.io/I2VControlCamera. |
| Open Datasets | Yes | Although previous methods trained on the RealEstate10K (Zhou et al., 2018) dataset for training... To compute FID, we randomly select 2000 video frames from WebVid (Bain et al., 2021). |
| Dataset Splits | Yes | we collect a dataset of 30K video clips as our training set... The first testing set comprises 500 random static scene clips from RealEstate10K... The second testing set consists of 480 samples generated by text-to-image model... |
| Hardware Specification | Yes | We use 16 NVIDIA A100 GPUs to train them with a batch size 1 per GPU for 20K steps, taking about 36 hours. |
| Software Dependencies | No | The paper mentions using the "Image-to-Video version of MagicVideo-V2" as a base model and the "RAFT optical flow model", but does not provide specific version numbers for these or for any other software libraries or dependencies. |
| Experiment Setup | Yes | We set the frame number as 24 and the resolution as 704 × 448. We use 16 NVIDIA A100 GPUs to train them with a batch size 1 per GPU for 20K steps... During training, we fix the parameters of the base model and only train our adapter part. We adopt the same loss function and the same scheduler, with the sole modification being the introduction of the control signal condition. |
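
The hardware and setup figures quoted in the table imply a few derived quantities that can help when planning a reproduction. The sketch below simply restates the reported numbers and does the arithmetic. It is not the authors' configuration: the class and field names are hypothetical, "30K" and "20K" are read as 30,000 and 20,000, "about 36 hours" is treated as exactly 36, and it assumes no gradient accumulation (one clip per GPU per optimizer step), which the quoted text does not confirm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReportedSetup:
    """Values quoted in the table; names are illustrative, not from the paper."""
    num_frames: int = 24                 # frames per clip
    resolution: tuple = (704, 448)       # as reported; axis order not stated
    num_gpus: int = 16                   # NVIDIA A100
    batch_per_gpu: int = 1
    train_steps: int = 20_000            # "20K steps"
    train_hours: float = 36.0            # "about 36 hours" (approximate)
    train_clips: int = 30_000            # "30K video clips"

    @property
    def global_batch(self) -> int:
        # Assumption: plain data parallelism, no gradient accumulation.
        return self.num_gpus * self.batch_per_gpu

    @property
    def clips_seen(self) -> int:
        # Total clips consumed over the whole run.
        return self.global_batch * self.train_steps

    @property
    def approx_epochs(self) -> float:
        # Passes over the training set implied by the numbers above.
        return self.clips_seen / self.train_clips

    @property
    def gpu_hours(self) -> float:
        return self.num_gpus * self.train_hours


setup = ReportedSetup()
print(f"global batch size : {setup.global_batch}")
print(f"clips seen        : {setup.clips_seen}")
print(f"approx. epochs    : {setup.approx_epochs:.2f}")
print(f"GPU-hours         : {setup.gpu_hours:.0f}")
```

Under these assumptions the run corresponds to a global batch of 16, roughly 10 to 11 passes over the training clips, and a budget on the order of 576 A100 GPU-hours. Any of these would change if the actual training used gradient accumulation or a different sampling scheme.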