VE-Bench: Subjective-Aligned Benchmark Suite for Text-Driven Video Editing Quality Assessment
Authors: Shangkun Sun, Xiaoyu Liang, Songlin Fan, Wenxu Gao, Wei Gao
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | This suite includes VE-Bench DB, a video quality assessment (VQA) database for video editing. VE-Bench DB encompasses a diverse set of source videos featuring various motions and subjects, along with multiple distinct editing prompts, editing results from 8 different models, and the corresponding Mean Opinion Scores (MOS) from 24 human annotators. Based on VE-Bench DB, we further propose VE-Bench QA, a quantitative human-aligned measurement for the text-driven video editing task. ... Detailed experiments demonstrate that VE-Bench QA achieves state-of-the-art alignment with human preferences, surpassing existing advanced metrics and VQA methods. |
| Researcher Affiliation | Academia | 1Guangdong Provincial Key Laboratory of Ultra High Definition Immersive Media Technology, SECE, Shenzhen Graduate School, Peking University, 2Peng Cheng Laboratory |
| Pseudocode | No | The paper describes methods and network architecture in text and diagrams (e.g., Figure 6), but does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code https://github.com/littlespray/VE-Bench |
| Open Datasets | Yes | Datasets https://openi.pcl.ac.cn/Open Datasets |
| Dataset Splits | Yes | Following the 10-fold method (Kou et al. 2023; Wu et al. 2023a; Sun et al. 2022), all models are trained with the initial learning rate of 1e-3 and the batch size of 8 on VE-Bench DB for 60 epochs. |
| Hardware Specification | Yes | We build all models via PyTorch and train them via NVIDIA V100 GPUs. |
| Software Dependencies | No | We build all models via PyTorch and train them via NVIDIA V100 GPUs. The paper mentions PyTorch but does not specify a version number. |
| Experiment Setup | Yes | All models are trained with the initial learning rate of 1e-3 and the batch size of 8 on VE-Bench DB for 60 epochs. Following DOVER (Wu et al. 2023a), we first fine-tune the head for 40 epochs with linear probing, and then train all parameters for another 20 epochs. The Adam (Kingma and Ba 2014) optimizer and a cosine scheduler are applied during training. |
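The reported training recipe (40 epochs of linear probing of the head followed by 20 epochs of full fine-tuning, Adam with a cosine schedule, learning rate 1e-3, batch size 8) can be sketched in PyTorch as below. This is a minimal illustration only: the model, dataset, and stage structure are placeholders, since the paper does not release these hyperparameter details beyond what is quoted above.

```python
# Hedged sketch of the two-stage schedule described in the paper:
# stage 1 freezes the backbone and trains only the head (linear probing),
# stage 2 unfreezes everything. Model and data are toy stand-ins.
import torch
import torch.nn as nn
from torch.optim import Adam
from torch.optim.lr_scheduler import CosineAnnealingLR

# Placeholder network: a small "backbone" plus a quality-score head.
model = nn.Sequential(
    nn.Linear(64, 128),  # stands in for the video backbone
    nn.ReLU(),
    nn.Linear(128, 1),   # regression head predicting a MOS-like score
)
backbone, head = model[:2], model[2]

def run_stage(params, epochs, loader):
    """Train the given parameters with Adam (lr 1e-3) + cosine schedule."""
    opt = Adam(params, lr=1e-3)
    sched = CosineAnnealingLR(opt, T_max=epochs)
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            loss = nn.functional.mse_loss(model(x).squeeze(-1), y)
            loss.backward()
            opt.step()
        sched.step()

# Tiny synthetic loader standing in for one VE-Bench DB training fold
# (batch size 8, as reported).
data = [(torch.randn(8, 64), torch.rand(8)) for _ in range(4)]

# Stage 1: linear probing -- train only the head for 40 epochs.
for p in backbone.parameters():
    p.requires_grad_(False)
run_stage(head.parameters(), 40, data)

# Stage 2: unfreeze and fine-tune all parameters for another 20 epochs.
for p in backbone.parameters():
    p.requires_grad_(True)
run_stage(model.parameters(), 20, data)
```

In a faithful reproduction, the placeholder network would be replaced by the VE-Bench QA architecture and the synthetic loader by one fold of the 10-fold split over VE-Bench DB.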