OLMD: Orientation-aware Long-term Motion Decoupling for Continuous Sign Language Recognition

Authors: Yiheng Yu, Sheng Liu, Yuan Feng, Min Xu, Zhelun Jin, Xuhua Yang

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type: Experimental. OLMD shows SOTA performance on three large-scale datasets (PHOENIX14 (Forster et al. 2015), PHOENIX14-T (Camgoz et al. 2018), and CSL-Daily (Zhou et al. 2021)), improving the word error rate (WER) on PHOENIX14 by an absolute 1.6% over the previous SOTA; Fig. 1b highlights this result.
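For context on the headline metric: WER in CSLR is the word-level edit distance (substitutions + insertions + deletions) between the predicted and reference gloss sequences, normalized by the reference length. A minimal sketch of this standard computation (illustrative code, not the authors' implementation):

```python
def wer(reference, hypothesis):
    """Word error rate: Levenshtein distance over word sequences,
    divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    # DP table: d[i][j] = edit distance between ref[:i] and hyp[:j]
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # substitution
    return d[-1][-1] / len(ref)

print(wer("a b c d", "a x c"))  # 1 substitution + 1 deletion over 4 words = 0.5
```

Note that WER can exceed 1.0 when the hypothesis requires more edits than there are reference words, which is why "absolute 1.6%" improvements are reported on this scale.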
Researcher Affiliation: Academia. Zhejiang University of Technology.
Pseudocode: No. The paper describes the methodology with mathematical equations and block diagrams (e.g., Figures 2 and 3), but it does not include any clearly labeled pseudocode or algorithm blocks.
Open Source Code: No. The paper neither states that source code will be released nor provides a link to a code repository for the described methodology.
Open Datasets: Yes. The paper mainly uses three large CSLR datasets: PHOENIX14 (Forster et al. 2015), a popular CSLR dataset of German TV weather reports with a vocabulary of 1,295 words, recorded by 9 signers; PHOENIX14-T (Camgoz et al. 2018), an expanded version of PHOENIX14 offering 7,096 training, 519 development (Dev), and 642 testing (Test) videos; and CSL-Daily (Zhou et al. 2021), a large-scale Chinese sign language dataset filmed by 10 different signers, covering a variety of daily-life themes with over 20,000 sentences.
Dataset Splits: Yes. PHOENIX14 includes 5,672 training, 540 development (Dev), and 629 testing (Test) videos; PHOENIX14-T offers 7,096 training, 519 development, and 642 testing videos; CSL-Daily is split into 18,401 training, 1,077 development, and 1,176 test samples, with vocabularies of 2,000 signs and 2,343 Chinese text words.
Hardware Specification: Yes. All training and testing are completed on a single NVIDIA A6000 GPU.
Software Dependencies: No. The paper mentions models such as ResNet34, 1D-CNNs, and BiLSTM, and the Adam optimizer, but does not give version numbers for the software libraries, frameworks (e.g., PyTorch, TensorFlow), or programming languages used in the implementation.
Experiment Setup: Yes. During training, the batch size is 2 and the initial learning rate is 0.001, reduced to 30% of its value at epochs 25 and 40. The Adam optimizer is used with a weight decay of 0.001, for a total of 70 epochs. All input frames are first resized to 256×256 and then randomly cropped to 224×224 during training, with a 50% chance of horizontal flipping and a 20% probability of temporal scale adjustment. For inference, a central 224×224 crop is used.
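The stated schedule ("reduced to 30% of its value at epochs 25 and 40") reads as a step decay with factor 0.3, which in PyTorch would correspond to MultiStepLR(milestones=[25, 40], gamma=0.3). A minimal sketch of the schedule under that reading (function name and reading are assumptions, not the authors' code):

```python
def lr_at_epoch(epoch, base_lr=1e-3, milestones=(25, 40), gamma=0.3):
    """Step-decay schedule: the learning rate is multiplied by `gamma`
    (here 0.3, i.e. 'reduced to 30%') at each milestone epoch."""
    lr = base_lr
    for m in milestones:
        if epoch >= m:
            lr *= gamma
    return lr

# 70 epochs total, per the paper: roughly 1e-3 for epochs 0-24,
# 3e-4 for epochs 25-39, and 9e-5 for epochs 40-69.
schedule = [lr_at_epoch(e) for e in range(70)]
```

An alternative reading ("reduced by 30%", i.e. gamma=0.7) is possible from the phrasing, but gamma=0.3 is the more common convention for this kind of milestone schedule.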