EvSTVSR: Event Guided Space-Time Video Super-Resolution
Authors: Haojie Yan, Zhan Lu, Zehao Chen, De Ma, Huajin Tang, Qian Zheng, Gang Pan
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results demonstrate that our method not only outperforms existing RGB-based approaches but also excels in handling large motion scenarios. |
| Researcher Affiliation | Academia | ¹The State Key Lab of Brain-Machine Intelligence, Zhejiang University, China; ²College of Computer Science and Technology, Zhejiang University, China; ³School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore |
| Pseudocode | No | The paper describes methods with formulas and block diagrams but does not contain a clearly labeled pseudocode or algorithm block. |
| Open Source Code | Yes | Code: https://github.com/hjyyyd/EvSTVSR |
| Open Datasets | Yes | Similar to previous methods that addressed the STVSR task, we followed the training and testing protocols of VideoINR (Chen et al. 2022) to validate our approach on the Adobe240 (Su et al. 2017) and GoPro (Nah, Hyun Kim, and Mu Lee 2017) datasets. Both datasets have a resolution of 1280×720 and a frame rate of 240 fps. We generated events between consecutive frames using vid2e (Gehrig et al. 2020) to simulate realistic event noise, showcasing our method's robustness to noise. The Adobe240 dataset includes 100 training, 16 validation, and 17 testing videos, while the GoPro dataset contains 22 training and 11 testing videos. We trained our model on Adobe and tested it on both Adobe and GoPro, following VideoINR's approach. We used a sliding window of 9 frames, with the 1st and 9th frames, along with intermediate events, as inputs, down-sampled by a factor of 4. The high-resolution frames served as the ground truth. VFI and VSR Datasets. Since our method can independently perform both super-resolution and interpolation tasks, we conducted experiments on two real event datasets to validate their performance thoroughly. Specifically, we performed Video Frame Interpolation (VFI) experiments on the BS-ERGB dataset (Tulyakov et al. 2022). BS-ERGB is widely used for event-guided VFI tasks and is characterized by complex motions, including non-linear and large movements. We trained and tested our method on this dataset and compared the results with previous methods. Additionally, we performed video super-resolution (VSR) experiments on the CED dataset (Scheerlinck et al. 2019), and compared our results with those of prior approaches. |
| Dataset Splits | Yes | The Adobe240 dataset includes 100 training, 16 validation, and 17 testing videos, while the GoPro dataset contains 22 training and 11 testing videos. We trained our model on Adobe and tested it on both Adobe and GoPro, following VideoINR's approach. |
| Hardware Specification | Yes | The experiments were executed on four NVIDIA RTX 3090 GPUs. |
| Software Dependencies | No | For all experiments, the Adam optimizer (Kingma 2014) was employed with hyperparameters β1 = 0.9 and β2 = 0.999. The paper mentions the Adam optimizer and RAFT optical flow model, but does not provide specific version numbers for software libraries or environments (e.g., Python, PyTorch, CUDA versions). |
| Experiment Setup | Yes | For all experiments, the Adam optimizer (Kingma 2014) was employed with hyperparameters β1 = 0.9 and β2 = 0.999. The initial learning rate was set at 4×10⁻⁴ and was systematically reduced to 1×10⁻⁷ through cosine annealing every 150k iterations. The training was conducted over 600k iterations with a batch size of 8. Data augmentation strategies, including random rotations and random cropping, were applied. |
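The 9-frame sliding-window protocol quoted above (1st and 9th frames as low-resolution inputs, all frames as high-resolution ground truth) can be sketched as a plain index generator. This is an illustration only: the paper does not state the window stride, so the stride-1 default here is an assumption, and `sliding_windows` is a hypothetical helper name, not code from the EvSTVSR repository.

```python
def sliding_windows(num_frames, window=9, stride=1):
    """Yield (input_indices, target_indices) pairs for the 9-frame
    sliding-window protocol: the window's endpoint frames are the
    (down-sampled) inputs; every frame in the window is ground truth.
    The stride is an assumption; the paper only specifies the window size.
    """
    for start in range(0, num_frames - window + 1, stride):
        idx = list(range(start, start + window))
        inputs = (idx[0], idx[-1])   # 1st and 9th frames (plus events, not modeled here)
        targets = idx                # all 9 high-resolution frames as GT
        yield inputs, targets
```

For a 240 fps clip, each window thus supervises seven intermediate frames per pair of input key frames.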
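The learning-rate schedule in the Experiment Setup row (4×10⁻⁴ annealed to 1×10⁻⁷ with cosine restarts every 150k iterations) corresponds to a standard cosine-annealing-with-warm-restarts curve. A minimal sketch, assuming the usual closed form; the function name and the exact restart behavior at cycle boundaries are illustrative, not taken from the paper's code.

```python
import math

def cosine_annealed_lr(step, lr_max=4e-4, lr_min=1e-7, period=150_000):
    """Cosine annealing with warm restarts: the learning rate starts at
    lr_max, decays to lr_min over each `period` iterations following a
    half-cosine, then restarts at lr_max for the next cycle."""
    t = step % period  # position within the current 150k-iteration cycle
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * t / period))
```

Over the stated 600k training iterations this yields four full annealing cycles, each beginning near 4×10⁻⁴ and ending near 1×10⁻⁷.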