Aligning Human Motion Generation with Human Perceptions
Authors: Haoru Wang, Wentao Zhu, Luyi Miao, Yishu Xu, Feng Gao, Qi Tian, Yizhou Wang
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments demonstrate the effectiveness of our approach in both evaluating and improving the quality of generated human motions by aligning with human perceptions. |
| Researcher Affiliation | Collaboration | 1Center on Frontiers of Computing Studies, School of Computer Science, Peking University; 2School of Arts, Peking University; 3Inst. for Artificial Intelligence, Peking University; 4Huawei Technologies, Ltd.; 5Nat'l Eng. Research Center of Visual Technology, Peking University; 6State Key Laboratory of General Artificial Intelligence, Peking University |
| Pseudocode | Yes | Algorithm 1 Fine-tuning Motion Generation with Motion Critic |
| Open Source Code | Yes | Code and data are publicly available at https://motioncritic.github.io/. |
| Open Datasets | Yes | We contribute Motion Percept, a large-scale motion perceptual evaluation dataset with manual annotations. Code and data are publicly available at https://motioncritic.github.io/. |
| Dataset Splits | Yes | We train our critic model using the MDM subset in Motion Percept. We convert each multiple-choice question into three ordered preference pairs, which results in 46740 pairs for training and 5823 pairs for testing. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware used to run its experiments, such as GPU models, CPU models, or memory details. |
| Software Dependencies | No | The paper mentions using DSTformer, MDM, and adapting NumPy to PyTorch, but does not provide specific version numbers for these or other software dependencies like Python or CUDA. |
| Experiment Setup | Yes | We train the critic model for 150 epochs with a batch size of 64 and a learning rate starting at 2e-3, decreasing with a 0.995 exponential learning rate decay. We fine-tune for 800 iterations, with a batch size of 64 and learning rate 1e-5. We fine-tune with critic clipping threshold τ = 12.0, critic re-weight scale λ = 1e-3, and KL loss re-weight scale µ = 1.0. We set the step sampling range [T1, T2] = [700, 900]. |
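The critic-training schedule reported in the table (learning rate starting at 2e-3 with a 0.995 exponential decay over 150 epochs) can be sketched as plain-Python arithmetic. This is an illustrative reconstruction of the stated hyperparameters only, not the authors' actual training code; the function name `critic_lr_schedule` is a placeholder.

```python
# Hedged sketch of the reported critic learning-rate schedule:
# start at 2e-3 and multiply by 0.995 once per epoch, for 150 epochs.

def critic_lr_schedule(base_lr=2e-3, gamma=0.995, epochs=150):
    """Return the per-epoch learning rates under exponential decay."""
    lrs = []
    lr = base_lr
    for _ in range(epochs):
        lrs.append(lr)
        lr *= gamma  # one decay step per epoch
    return lrs

lrs = critic_lr_schedule()
print(f"epoch 1 lr: {lrs[0]:.2e}, epoch 150 lr: {lrs[-1]:.2e}")
```

Under these numbers the learning rate ends roughly half its starting value by epoch 150, consistent with the mild 0.995 decay factor the paper reports.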