TrackGo: A Flexible and Efficient Method for Controllable Video Generation

Authors: Haitao Zhou, Chuang Wang, Rui Nie, Jinlin Liu, Dongdong Yu, Qian Yu, Changhu Wang

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Our experimental results demonstrate that our new approach, enhanced by the Track Adapter, achieves state-of-the-art performance on key metrics such as FVD, FID, and ObjMC scores. We conduct extensive experiments to validate our approach. The experimental results demonstrate that our model surpasses existing models in terms of video quality (FVD), image quality (FID), and motion faithfulness (ObjMC)."
Researcher Affiliation | Collaboration | 1 Beihang University, 2 AIsphere Tech. EMAIL EMAIL
Pseudocode | No | The paper describes its methodology using textual explanations and mathematical equations (Eq. 1-9), but it does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | No | Project Page: https://zhtjtcz.github.io/TrackGo-Page/. This is a project page, not an explicit statement of code release or a direct link to a code repository for the methodology described in the paper.
Open Datasets | Yes | "Our test set comprised the VIPSeg validation set along with an additional 300-video subset from our internal validation dataset."
Dataset Splits | Yes | "Following the experimental design, we further filtered the data to obtain a subset of about 110K videos as our final training dataset. Our test set comprised the VIPSeg validation set along with an additional 300-video subset from our internal validation dataset."
Hardware Specification | Yes | "All experiments were conducted using PyTorch with 8 NVIDIA A100-80G GPUs."
Software Dependencies | No | "All experiments were conducted using PyTorch with 8 NVIDIA A100-80G GPUs. AdamW (Loshchilov and Hutter 2017) is configured as our optimizer." The paper mentions PyTorch, but does not specify a version number or any other software dependencies with version numbers.
Experiment Setup | Yes | "AdamW (Loshchilov and Hutter 2017) is configured as our optimizer, running for a total of 18,000 training steps with a learning rate of 3e-5 and a batch size of 8."