VE-Bench: Subjective-Aligned Benchmark Suite for Text-Driven Video Editing Quality Assessment

Authors: Shangkun Sun, Xiaoyu Liang, Songlin Fan, Wenxu Gao, Wei Gao

AAAI 2025 | Venue PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental This suite includes VE-Bench DB, a video quality assessment (VQA) database for video editing. VE-Bench DB encompasses a diverse set of source videos featuring various motions and subjects, along with multiple distinct editing prompts, editing results from 8 different models, and the corresponding Mean Opinion Scores (MOS) from 24 human annotators. Based on VE-Bench DB, we further propose VE-Bench QA, a quantitative human-aligned measurement for the text-driven video editing task. ... Detailed experiments demonstrate that VE-Bench QA achieves state-of-the-art alignment with human preferences, surpassing existing advanced metrics and VQA methods.
Researcher Affiliation Academia 1Guangdong Provincial Key Laboratory of Ultra High Definition Immersive Media Technology, SECE, Shenzhen Graduate School, Peking University, 2Peng Cheng Laboratory EMAIL, EMAIL
Pseudocode No The paper describes methods and network architecture in text and diagrams (e.g., Figure 6), but does not include any clearly labeled pseudocode or algorithm blocks.
Open Source Code Yes Code https://github.com/littlespray/VE-Bench
Open Datasets Yes Datasets https://openi.pcl.ac.cn/Open Datasets
Dataset Splits Yes Following the 10-fold method (Kou et al. 2023; Wu et al. 2023a; Sun et al. 2022), all models are trained with the initial learning rate of 1e-3 and the batch size of 8 on VE-Bench DB for 60 epochs.
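The 10-fold evaluation above partitions VE-Bench DB into ten disjoint test folds, training on the remaining nine each time. A minimal sketch of such a split, assuming round-robin fold assignment after a seeded shuffle (the authors' exact fold protocol and seed are not specified):

```python
import random

def ten_fold_splits(n_items, seed=42, k=10):
    """Return k (train_idx, test_idx) pairs for k-fold cross-validation.

    Indices are shuffled with a fixed seed, then assigned to folds
    round-robin so fold sizes differ by at most one item.
    """
    idx = list(range(n_items))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # round-robin fold assignment
    splits = []
    for i in range(k):
        test = folds[i]
        train = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        splits.append((train, test))
    return splits

# Example: 100 edited-video samples split into 10 folds of 10.
splits = ten_fold_splits(100)
```

Each sample appears in exactly one test fold, so reported scores average over held-out predictions for the whole database.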
Hardware Specification Yes We build all models via PyTorch and train them on NVIDIA V100 GPUs.
Software Dependencies No We build all models via PyTorch and train them on NVIDIA V100 GPUs. The paper mentions PyTorch but does not specify a version number.
Experiment Setup Yes All models are trained with the initial learning rate of 1e-3 and the batch size of 8 on VE-Bench DB for 60 epochs. Following DOVER (Wu et al. 2023a), we first fine-tune the head for 40 epochs with linear probing, and then train all parameters for another 20 epochs. The Adam (Kingma and Ba 2014) optimizer and a cosine scheduler are applied during training.
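The schedule described above (40 head-only epochs, then 20 full-model epochs, with cosine-annealed learning rate starting at 1e-3) can be sketched as follows. This is a hedged reconstruction: the minimum learning rate and whether the cosine cycle spans all 60 epochs or restarts per stage are assumptions, as the paper does not state them.

```python
import math

INIT_LR = 1e-3       # initial learning rate from the paper
PROBE_EPOCHS = 40    # stage 1: linear probing (head only)
FULL_EPOCHS = 20     # stage 2: all parameters trainable
TOTAL_EPOCHS = PROBE_EPOCHS + FULL_EPOCHS  # 60 epochs total

def cosine_lr(epoch, total=TOTAL_EPOCHS, base=INIT_LR, min_lr=0.0):
    """Cosine-annealed learning rate at a given epoch.

    Decays smoothly from `base` at epoch 0 toward `min_lr` at `total`,
    assuming a single cosine cycle over the whole run.
    """
    return min_lr + 0.5 * (base - min_lr) * (1 + math.cos(math.pi * epoch / total))

# Per-epoch plan: (epoch, stage, learning rate)
plan = [
    (e, "probe" if e < PROBE_EPOCHS else "full", cosine_lr(e))
    for e in range(TOTAL_EPOCHS)
]
schedule = [lr for _, _, lr in plan]
```

In a PyTorch implementation this would correspond to `torch.optim.Adam` combined with `CosineAnnealingLR`, freezing all backbone parameters during the probing stage and unfreezing them for the final 20 epochs.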