A Survey on Future Frame Synthesis: Bridging Deterministic and Generative Approaches

Authors: Ruibo Ming, Zhewei Huang, Jingwei Wu, Zhuoxuan Ju, Daxin Jiang, Jianming Hu, Lihui Peng, Shuchang Zhou

TMLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | This survey provides a comprehensive analysis of the FFS landscape, charting its critical evolution from deterministic algorithms focused on pixel-level accuracy to modern generative paradigms that prioritize semantic coherence and dynamic plausibility. By pinpointing key challenges and proposing concrete research questions for both frontiers, this survey serves as an essential guide for researchers aiming to advance the frontiers of visual dynamic modeling.
Researcher Affiliation | Collaboration | Ruibo Ming EMAIL Tsinghua University, StepFun; Zhewei Huang EMAIL StepFun; Jingwei Wu EMAIL StepFun; Zhuoxuan Ju EMAIL Peking University, StepFun; Daxin Jiang EMAIL StepFun; Jianming Hu EMAIL Tsinghua University; Lihui Peng EMAIL Tsinghua University; Shuchang Zhou EMAIL StepFun, Megvii Technology
Pseudocode | No | The paper is a survey and does not present any novel algorithms with corresponding pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide concrete access to source code for the methodology described in this survey. It mentions open-source models from other papers (e.g., HunyuanVideo, Step-Video-T2V, Seaweed-7B, MAGI-1) but not code for this survey itself.
Open Datasets | Yes | In Table 1, we summarize the most widely used datasets in video synthesis, highlighting their scale and available supervisory signals to provide a comprehensive overview of the current dataset landscape. Examples include KTH Action (Schuldt et al., 2004), Caltech Pedestrian (Dollar et al., 2011), Moving MNIST (Srivastava et al., 2015), and many others with corresponding citations.
Dataset Splits | No | The paper is a survey and discusses various datasets, but it does not perform its own experiments or specify the training/validation/test splits needed to reproduce experiments. It reviews datasets used by other works.
Hardware Specification | No | The paper mentions 'prevailing GPU computing power' and 'significant growth in computational resources' in the context of surveyed models, but it does not provide specific hardware details (e.g., GPU models, CPU models) for any experiments conducted by the authors of this survey.
Software Dependencies | No | The paper discusses various architectures and model families (e.g., CNNs, RNNs, Transformers, VAEs, GANs, diffusion models) used in the surveyed literature. However, it does not specify any software dependencies with version numbers for the survey itself.
Experiment Setup | No | The paper is a survey and does not conduct its own experiments. Therefore, it does not describe any experimental setup details such as hyperparameters or training configurations.