Light-T2M: A Lightweight and Fast Model for Text-to-motion Generation

Authors: Ling-An Zeng, Guohong Huang, Gaojie Wu, Wei-Shi Zheng

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Compared to the state-of-the-art method, MoMask, our Light-T2M model features just 10% of the parameters (4.48M vs 44.85M) and achieves a 16% faster inference time (0.152s vs 0.180s), while surpassing MoMask with an FID of 0.040 (vs. 0.045) on the HumanML3D dataset and 0.161 (vs. 0.228) on the KIT-ML dataset. [...] 4 Experiments [...] 4.3 Comparison with State-of-the-arts [...] 4.4 Ablation Studies
Researcher Affiliation | Academia | Ling-An Zeng (1), Guohong Huang (1), Gaojie Wu (1), Wei-Shi Zheng (1,2)* — (1) Sun Yat-sen University; (2) Key Laboratory of Machine Intelligence and Advanced Computing, Ministry of Education, China. EMAIL, EMAIL, EMAIL, EMAIL
Pseudocode | No | The paper describes methods and architectures using figures and text but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks.
Open Source Code | Yes | Code: https://github.com/qinghuannn/light-t2m
Open Datasets | Yes | We conduct experiments on two most common public text-motion datasets, i.e., the HumanML3D dataset (Guo et al. 2022a) and the KIT-ML dataset (Plappert, Mandery, and Asfour 2016).
Dataset Splits | Yes | For both datasets, the preprocessing procedure and the train-test-validation split remain consistent with (Guo et al. 2022a).
Hardware Specification | Yes | Average Inference Time (AIT) is calculated from the average across 100 samples using the same RTX 3090Ti GPU. [...] Our Light-T2M is optimized by AdamW (Loshchilov and Hutter 2019) with a learning rate of 2e-4, a cosine annealing schedule, and a batch size of 256 on 2 RTX 3090Ti GPUs.
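The AIT protocol above (wall-clock time averaged over repeated generation runs) can be sketched as follows. This is a minimal illustration, not the paper's measurement code: `run_model` is a hypothetical stand-in for the actual text-to-motion generator, and on a GPU one would additionally call `torch.cuda.synchronize()` before and after each timed run so queued kernels are included.

```python
# Sketch of Average Inference Time (AIT) measurement: mean wall-clock
# seconds per sample over repeated runs. The workload below is a dummy
# stand-in for the real model's forward/sampling pass.
import time

def average_inference_time(run_model, n_samples=100):
    """Return mean per-sample inference time in seconds."""
    total = 0.0
    for _ in range(n_samples):
        start = time.perf_counter()   # monotonic high-resolution clock
        run_model()
        total += time.perf_counter() - start
    return total / n_samples

dummy = lambda: sum(i * i for i in range(10_000))  # placeholder workload
ait = average_inference_time(dummy, n_samples=20)
print(f"AIT: {ait:.6f}s")
```

Using `time.perf_counter` rather than `time.time` avoids clock-adjustment artifacts in short measurements; averaging over many samples (100 in the paper) smooths out per-run jitter.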
Software Dependencies | No | The paper mentions several models and optimizers like AdamW, CLIP, Mamba, and UniPC, but does not specify software environment dependencies such as Python, PyTorch/TensorFlow, or CUDA versions.
Experiment Setup | Yes | The max diffusion step T is 1000 and the linearly varying variances βt range from 10^-4 to 10^-2. During inference, we adopt UniPC (Zhao et al. 2023) with 10 time steps for the fast sampling. The number of blocks N, the hidden dim D, and the downsampling factor S are 4, 256, and 8, respectively. The guidance scale s and the text dropout ratio τ are set to 4 and 0.2, respectively. Our Light-T2M is optimized by AdamW (Loshchilov and Hutter 2019) with a learning rate of 2e-4, a cosine annealing schedule, and a batch size of 256 on 2 RTX 3090Ti GPUs. Light-T2M is trained with 3000/5000 epochs on the HumanML3D/KIT-ML datasets.
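The reported noise schedule (T = 1000 steps, variances βt varying linearly from 10^-4 to 10^-2) can be sketched as below. This is a pure-Python illustration under the stated hyperparameters, not the paper's implementation; the cumulative product ᾱ_T = Π(1 − βt) is included to show that the schedule drives the signal close to pure noise by the final step.

```python
# Sketch of the reported linear diffusion variance schedule:
# T = 1000 steps, beta_t from 1e-4 to 1e-2 (values from the paper).
T = 1000
beta_start, beta_end = 1e-4, 1e-2

def beta(t):
    """Linear variance schedule for integer t in [0, T-1]."""
    return beta_start + (beta_end - beta_start) * t / (T - 1)

alpha_bar = 1.0
for t in range(T):            # cumulative product of (1 - beta_t)
    alpha_bar *= 1.0 - beta(t)

print(beta(0), beta(T - 1), alpha_bar)
```

The small terminal ᾱ_T (well under 0.01 here) is what lets sampling start from Gaussian noise; at inference the paper replaces the full 1000-step reverse process with the UniPC solver using only 10 time steps.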