Video Repurposing from User Generated Content: A Large-scale Dataset and Benchmark

Authors: Yongliang Wu, Wenbo Zhu, Jiawang Cao, Yi Lu, Bozheng Li, Weiheng Chi, Zihan Qiu, Lirian Su, Haolin Zheng, Jay Wu, Xu Yang

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Through comprehensive experimental analysis and comparison with other state-of-the-art video highlight detection models, the authors demonstrate the superior performance and practical applicability of their proposed model for this task. The paper includes sections such as "Experiments Setting," "Results and Analysis," and various ablation studies.
Researcher Affiliation | Collaboration | The authors are affiliated with "1Southeast University," "2Opus AI Research," "3University of Toronto," "4Brown University," and "5National University of Singapore." This mix of universities (academic) and a private company (Opus AI Research) indicates a collaborative affiliation type.
Pseudocode | No | No explicit pseudocode or algorithm blocks were found in the paper. The methodology is described in prose and mathematical formulations.
Open Source Code | Yes | Code: https://github.com/yongliang-wu/Repurpose
Open Datasets | Yes | We introduce Repurpose-10K, a large-scale dataset specifically curated for the video repurposing task. ... Code: https://github.com/yongliang-wu/Repurpose. Additionally, the PANN model is trained on AudioSet (Gemmeke et al. 2017).
Dataset Splits | Yes | We partition the dataset into train/val/test splits at a ratio of 8/1/1.
Hardware Specification | Yes | All experiments are conducted on two A100 GPUs within the PyTorch framework.
Software Dependencies | No | The paper mentions the PyTorch framework but does not specify a version number for it or for other key software dependencies (e.g., Python, CUDA) required for replication. It mentions models such as WhisperX, CLIP ViT-B/32, PANN, and all-MiniLM-L6-v2, but without their specific library versions.
Experiment Setup | Yes | The embedding dimension of the model is set to d = 512, and the numbers of layers Ns, Nc, and Nf are set to 3. We utilize the Adam optimizer with a learning rate of 1e-4 for 100 epochs, adjusted using cosine learning-rate decay, while the first 5 epochs employ linear warm-up to facilitate stable learning. The hyper-parameters λ1 through λ4 are set to 0.1, 0.3, 0.1, and 0.7, respectively.
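The 8/1/1 train/val/test partition reported above can be sketched as a simple shuffled split. This is an illustrative reconstruction only; the paper's actual split assignments are not reproduced here, and the seed is arbitrary:

```python
import random

def split_dataset(items, seed=42):
    """Partition items into train/val/test at an 8/1/1 ratio.

    Illustrative sketch: shuffles with a fixed (arbitrary) seed and
    slices at the 80% and 90% marks.
    """
    items = list(items)
    random.Random(seed).shuffle(items)
    n = len(items)
    n_train = int(n * 0.8)
    n_val = int(n * 0.1)
    return (items[:n_train],
            items[n_train:n_train + n_val],
            items[n_train + n_val:])

# Example: 10,000 clips -> 8,000 / 1,000 / 1,000
train, val, test = split_dataset(range(10000))
```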
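The training schedule quoted in the setup row (base learning rate 1e-4 over 100 epochs, cosine decay, 5-epoch linear warm-up) can be written out as a small helper. This is a sketch of the standard warm-up-plus-cosine recipe under those reported hyper-parameters, not code from the paper's repository:

```python
import math

# Hyper-parameters as reported in the Experiment Setup row.
BASE_LR = 1e-4
TOTAL_EPOCHS = 100
WARMUP_EPOCHS = 5

def learning_rate(epoch):
    """Per-epoch learning rate: linear warm-up, then cosine decay to ~0."""
    if epoch < WARMUP_EPOCHS:
        # Linear warm-up over the first 5 epochs.
        return BASE_LR * (epoch + 1) / WARMUP_EPOCHS
    # Cosine decay over the remaining epochs.
    progress = (epoch - WARMUP_EPOCHS) / (TOTAL_EPOCHS - WARMUP_EPOCHS)
    return BASE_LR * 0.5 * (1.0 + math.cos(math.pi * progress))

lrs = [learning_rate(e) for e in range(TOTAL_EPOCHS)]
```

The warm-up reaches the full 1e-4 at epoch 4, and the cosine term then decays the rate smoothly toward zero by epoch 99.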