UniMuMo: Unified Text, Music, and Motion Generation
Authors: Han Yang, Kun Su, Yutong Zhang, Jiaben Chen, Kaizhi Qian, Gaowen Liu, Chuang Gan
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments demonstrate that UniMuMo achieves competitive results on all unidirectional generation benchmarks across music, motion, and text modalities. |
| Researcher Affiliation | Collaboration | 1The Chinese University of Hong Kong, 2University of Washington, 3The University of British Columbia 4University of Massachusetts Amherst, 5MIT-IBM Watson AI Lab, 6Cisco Research |
| Pseudocode | No | The paper describes the model architecture and pipeline in prose, for example: "Our pipeline consists of three main stages: a music-motion joint tokenizer that encodes music and motion sequences into discrete representations within the same space, a music-motion transformer-decoder model trained on the task of music-motion joint generation, and a music-motion captioner that generates text descriptions from music and motion features." It does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code: https://github.com/hanyangclarence/UniMuMo |
| Open Datasets | Yes | With the augmented synchronized music-motion data, we can utilize existing music and motion datasets to train our unified generative model... Music4All dataset... AIST++ dataset... MusicQA dataset released by (Liu et al. 2023b)... HumanML3D test set. |
| Dataset Splits | No | The paper states: "More implementation details about hyperparameter choices, dataset, metrics and training/evaluation setups are in Appendix." While it mentions the datasets used, it does not describe train/validation/test splits in the main text. |
| Hardware Specification | No | The paper states: "More implementation details about hyperparameter choices, dataset, metrics and training/evaluation setups are in Appendix." However, no specific hardware details (like GPU or CPU models) are provided in the main text. |
| Software Dependencies | No | The paper mentions "Demucs (Défossez 2021; Rouard, Massa, and Défossez 2023)" as a tool used, but does not provide specific version numbers for it or any other software dependencies. It also states: "More implementation details about hyperparameter choices, dataset, metrics and training/evaluation setups are in Appendix." |
| Experiment Setup | Yes | Empirically, λ is set to 0.02... Empirically, µ is set to 0.85... More implementation details about hyperparameter choices, dataset, metrics and training/evaluation setups are in Appendix. |