3DTrajMaster: Mastering 3D Trajectory for Multi-Entity Motion in Video Generation
Authors: Xiao Fu, Xian Liu, Xintao Wang, Sida Peng, Menghan Xia, Xiaoyu Shi, Ziyang Yuan, Pengfei Wan, Di Zhang, Dahua Lin
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments show that 3DTrajMaster sets a new state-of-the-art in both accuracy and generalization for controlling multi-entity 3D motions. Project page: http://fuxiao0719.github.io/projects/3dtrajmaster. |
| Researcher Affiliation | Collaboration | 1The Chinese University of Hong Kong 2Kuaishou Technology 3Zhejiang University |
| Pseudocode | Yes | Algorithm 1 Annealed conditional sampling with classifier-free guidance (CFG) |
| Open Source Code | No | The paper provides a project page: http://fuxiao0719.github.io/projects/3dtrajmaster. However, this is described as a project page, which typically contains an overview or demonstration, rather than a direct link to a specific source code repository as required by the instructions for a 'Yes' classification. |
| Open Datasets | No | To address the lack of suitable training data, we construct a 360°-Motion Dataset, which first correlates collected 3D human and animal assets with GPT-generated trajectory and then captures their motion with 12 evenly-surround cameras on diverse 3D UE platforms. ... To circumvent the aforementioned challenges, we opt to construct a synthetic dataset, named 360°-Motion, through Unreal Engine (UE) with advanced rendering technologies (see Fig. 3). |
| Dataset Splits | No | The paper describes the construction of a custom dataset and how evaluation data was generated ('We collect 44 novel pose templates...', 'randomly assigned to poses to form 100 pairs'). However, it does not explicitly detail training, validation, or test splits for the 360-Motion Dataset used for training the model. |
| Hardware Specification | Yes | We utilize the Adam optimizer and train on a cluster of 8 NVIDIA H800 GPUs, with a learning rate of 5 x 10^-5 and a batch size of 8. |
| Software Dependencies | No | The paper mentions using the Adam optimizer and an internal video diffusion model but does not specify any software libraries (e.g., PyTorch, TensorFlow) or their version numbers. |
| Experiment Setup | Yes | We utilize the Adam optimizer and train on a cluster of 8 NVIDIA H800 GPUs, with a learning rate of 5 x 10^-5 and a batch size of 8. The training process consisted of 50,000 steps for the domain adaptor and an additional 36,000 steps for the object injector. During inference, we set the DDIM steps as 50 and the CFG as 12.5. |
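The inference settings above (50 DDIM steps, CFG scale 12.5) and the pseudocode entry (Algorithm 1, annealed conditional sampling with CFG) can be illustrated with a generic sketch. This is not the paper's implementation: the paper's annealing schedule and the `eps_model` interface are not specified here, so the denoiser below is a stand-in and the step is plain deterministic (eta = 0) DDIM with a fixed guidance scale.

```python
import numpy as np

def ddim_cfg_sample(eps_model, shape, alphas_cumprod,
                    num_steps=50, cfg_scale=12.5, seed=0):
    """Sketch of DDIM sampling with classifier-free guidance (CFG).

    eps_model(x, t, cond) predicts noise; cond=None means the
    unconditional branch. Both the interface and the schedule are
    assumptions for illustration, not the paper's exact algorithm.
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(shape)          # start from pure Gaussian noise
    T = len(alphas_cumprod)
    timesteps = np.linspace(T - 1, 0, num_steps).round().astype(int)
    for i, t in enumerate(timesteps):
        a_t = alphas_cumprod[t]
        a_prev = alphas_cumprod[timesteps[i + 1]] if i + 1 < num_steps else 1.0
        # CFG: extrapolate from the unconditional toward the conditional
        eps_c = eps_model(x, t, cond="trajectory")
        eps_u = eps_model(x, t, cond=None)
        eps = eps_u + cfg_scale * (eps_c - eps_u)
        # predict the clean sample, then take a deterministic DDIM step
        x0 = (x - np.sqrt(1.0 - a_t) * eps) / np.sqrt(a_t)
        x = np.sqrt(a_prev) * x0 + np.sqrt(1.0 - a_prev) * eps
    return x

# Toy denoiser (ignores cond/t) just to exercise the loop shape-wise.
toy_eps = lambda x, t, cond: 0.1 * x
sample = ddim_cfg_sample(toy_eps, (2, 3), np.linspace(0.9999, 0.01, 1000))
```

An annealed variant, as Algorithm 1's title suggests, would make `cfg_scale` a function of the step index rather than a constant.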