Video Diffusion Models: A Survey
Authors: Andrew Melnik, Michal Ljubljanac, Cong Lu, Qi Yan, Weiming Ren, Helge Ritter
TMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The paper discusses 'Evaluation Metrics' (Section 6.3) and 'Benchmarks' (Section 6.4) at length, and summarizes quantitative results from various models in 'Table 4: Video generation benchmarks', including performance metrics such as FID, FVD, IS, and CLIP-Sim. Aggregating and analyzing these empirical results from other studies constitutes data analysis. |
| Researcher Affiliation | Academia | Andrew Melnik (Bielefeld University), Michal Ljubljanac (Bielefeld University), Cong Lu (University of British Columbia), Qi Yan (University of British Columbia), Weiming Ren (University of Waterloo), Helge Ritter (Bielefeld University) |
| Pseudocode | No | The paper only describes steps in regular paragraph text and mathematical formulations (equations) without structured pseudocode or algorithm blocks. |
| Open Source Code | No | Website: https://github.com/ndrwmlnk/Awesome-Video-Diffusion-Models. This GitHub repository appears to be a curated list of resources related to video diffusion models, rather than the source code for the survey methodology described in this paper. |
| Open Datasets | Yes | Table 2 and Table 3 provide overviews of commonly used video and image datasets, respectively, including 'WebVid-10M (Bain et al., 2021)', 'UCF101 (Soomro et al., 2012)', 'ImageNet (Russakovsky et al., 2015)', and 'LAION-5B (Schuhmann et al., 2022)', all of which are established public datasets. |
| Dataset Splits | No | As a survey, the paper discusses how other works use dataset splits for training and evaluation (e.g., 'Most often, the benchmarked models are either directly trained on the train split of the evaluation data set...'), but it does not define dataset splits for any experiment conducted in the paper itself. |
| Hardware Specification | No | The paper discusses general hardware limitations and mentions 'current graphics cards' and 'high-end GPUs' in the context of video diffusion models (Section 5.2, Section 13), but it does not specify any particular hardware used for its own research or analysis. |
| Software Dependencies | No | The paper references various models and frameworks used in video diffusion research, such as 'Stable Diffusion' (Rombach et al., 2022) and 'CLIP' (Radford et al., 2021), but it does not list specific software dependencies with version numbers for its own analysis or methodology. |
| Experiment Setup | No | The paper, being a survey, does not detail a specific experimental setup with hyperparameters or training configurations for its own research. It describes various methods and models from other works, including their training approaches, but does not present its own experimental parameters. |
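The Research Type row mentions FID and FVD among the metrics the survey tabulates; both reduce to the Fréchet distance between two multivariate Gaussians fitted to feature activations (Inception-v3 activations for FID, video-network activations such as I3D for FVD). A minimal NumPy/SciPy sketch of that core statistic, with the feature extractor omitted and the function name being illustrative:

```python
import numpy as np
from scipy.linalg import sqrtm

def frechet_distance(feats_a, feats_b):
    """Fréchet distance between Gaussians fit to two feature sets
    (rows = samples, columns = feature dimensions). This is the
    statistic underlying FID/FVD once features are extracted."""
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)
    # Matrix square root of the covariance product; numerical noise
    # can introduce tiny imaginary components, which we discard.
    covmean = sqrtm(cov_a @ cov_b)
    if np.iscomplexobj(covmean):
        covmean = covmean.real
    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a + cov_b - 2.0 * covmean))

# Toy illustration with synthetic "features" (no real extractor here).
rng = np.random.default_rng(0)
real = rng.normal(0.0, 1.0, size=(1000, 8))
fake = rng.normal(0.5, 1.0, size=(1000, 8))
print(frechet_distance(real, real))  # near zero for identical sets
print(frechet_distance(real, fake))  # positive for shifted distribution
```

Lower values indicate that generated features are distributed more like real ones, which is why the benchmark tables the survey summarizes report FID/FVD as "lower is better".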