Modular-Cam: Modular Dynamic Camera-view Video Generation with LLM

Authors: Zirui Pan, Xin Wang, Yipeng Zhang, Hong Chen, Kwan Man Cheng, Yaofei Wu, Wenwu Zhu

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive qualitative and quantitative experiments prove our proposed Modular-Cam's strong capability of generating multi-scene videos together with its ability to achieve fine-grained control of camera movements.
Researcher Affiliation | Academia | (1) Department of Computer Science and Technology, Tsinghua University; (2) Beijing National Research Center for Information Science and Technology, Tsinghua University; (3) Beijing University of Technology
Pseudocode | No | The paper describes the methodology in prose, through diagrams (Figure 2), and through mathematical equations (Equations 1-6), but does not contain any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | No | Generated results are available at https://modular-cam.github.io. This link points to a demonstration page for results, not to the source code for the methodology described in the paper. No other explicit statement or link for a code release is found.
Open Datasets | Yes | We use the large-scale public video dataset WebVid-10M (Bain et al. 2021) as our training set to train the newly inserted temporal transformer layers.
Dataset Splits | No | The paper mentions using WebVid-10M as the training set, selecting 100,000 videos for training, and using a self-generated dataset of 1,000 instructions for quantitative comparison. However, it does not specify how these datasets are split into training, validation, or test sets, so the exact data partitioning cannot be reproduced.
Hardware Specification | No | The paper describes the experimental setup and training procedures but does not provide specific hardware details such as GPU models, CPU models, or memory specifications used for running the experiments.
Software Dependencies | No | The paper does not list the software dependencies with version numbers (e.g., Python 3.8, PyTorch 1.9, CUDA 11.1) needed to replicate the experiments.
Experiment Setup | No | The paper states that 'The whole training procedure can be found in the Appendix,' indicating that specific experimental setup details, such as hyperparameters or training configurations, are not provided in the main text.