Understanding Emotional Body Expressions via Large Language Models

Authors: Haifeng Lu, Jiuyi Chen, Feng Liang, Mingkui Tan, Runhao Zeng, Xiping Hu

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results demonstrate that our model achieves recognition accuracy comparable to existing methods. More importantly, with the support of background knowledge from LLMs, our model can generate detailed emotion descriptions based on classification results, even when trained on a limited amount of labeled skeleton data.
Researcher Affiliation | Academia | 1 Guangdong-Hong Kong-Macao Joint Laboratory for Emotional Intelligence and Pervasive Computing, Shenzhen MSU-BIT University; 2 Lanzhou University; 3 South China University of Technology; 4 Peng Cheng Laboratory; 5 Shenzhen University; 6 Beijing Institute of Technology. EMAIL, EMAIL, EMAIL
Pseudocode | No | The paper describes the methodology using textual explanations and diagrams (Figure 1), but does not include any explicit pseudocode or algorithm blocks.
Open Source Code | No | The paper does not contain any explicit statements about releasing source code for the described methodology, nor does it provide any links to a code repository.
Open Datasets | Yes | Emilya: The EMotional body expression In daiLY Actions dataset (Fourati and Pelachaud 2016) comprises 8,206 samples. ... KDAE: The Kinematic Dataset of Actors Expressing Emotions (Zhang et al. 2020) was collected using a portable motion capture system... EGBM: The Emotional Gestures and Body Movements Corpus (Sapiński et al. 2019) contains 560 samples recorded by a Kinect V2 camera at 30 Hz.
Dataset Splits | Yes | We randomly split the dataset into training and testing sets at a 4:1 ratio.
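The quoted 4:1 split can be sketched as a simple shuffled partition. This is an illustrative reconstruction, not the authors' code: the `split_dataset` helper, the fixed seed, and the use of the 8,206-sample Emilya count are assumptions for the example.

```python
import random

def split_dataset(indices, train_ratio=0.8, seed=0):
    """Randomly partition sample indices into train/test at a 4:1 ratio.

    The seed is fixed only so the example is repeatable; the paper does
    not state how the random split was seeded.
    """
    rng = random.Random(seed)
    shuffled = indices[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]

# e.g. splitting the 8,206 Emilya samples gives 6,564 train / 1,642 test
train_idx, test_idx = split_dataset(list(range(8206)))
```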
Hardware Specification | Yes | All experiments are conducted on 4 NVIDIA A100 GPUs and are implemented using the PyTorch framework (Paszke et al. 2019).
Software Dependencies | No | All experiments are conducted on 4 NVIDIA A100 GPUs and are implemented using the PyTorch framework (Paszke et al. 2019). The paper mentions PyTorch but does not specify its version number or versions of other key software components.
Experiment Setup | Yes | Training: For emotion description, the model is trained for 10,000 steps using labeled datasets. The LoRA parameters are configured with a rank of 64, an alpha of 16, and a dropout rate of 0.05. The global batch size is 16, and the maximum learning rate is 1e-5. For emotion recognition, the model undergoes 800,000 training steps, with the global batch size increased to 64. The LoRA parameters and maximum learning rate remain the same as in the emotion description stage. During skeleton encoder pre-training, we optimize the model for 200 epochs using the stochastic gradient descent (SGD) optimizer with an initial learning rate of 0.1 and a batch size of 64. The learning rate is reduced by a factor of 10 at epochs 100, 150, and 175. A warm-up strategy is applied for the first 5 epochs.
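The encoder pre-training schedule quoted above (initial learning rate 0.1, divided by 10 at epochs 100/150/175, warm-up over the first 5 epochs) can be sketched as a step schedule. Two assumptions are made here: the warm-up is taken to be linear (the paper only says a warm-up strategy is applied), and the LoRA field names follow the common peft-style convention rather than the authors' exact configuration.

```python
# LoRA values quoted from the paper (rank 64, alpha 16, dropout 0.05);
# the field names are peft-style conventions, assumed for illustration.
LORA_CONFIG = {"r": 64, "lora_alpha": 16, "lora_dropout": 0.05}

def encoder_lr(epoch, base_lr=0.1, warmup_epochs=5, milestones=(100, 150, 175)):
    """Learning rate for skeleton-encoder pre-training.

    Warm up over the first 5 epochs (linear shape assumed), then divide
    the base rate of 0.1 by 10 at each of epochs 100, 150, and 175.
    """
    if epoch < warmup_epochs:
        return base_lr * (epoch + 1) / warmup_epochs
    lr = base_lr
    for m in milestones:
        if epoch >= m:
            lr /= 10.0
    return lr
```

In PyTorch terms, the post-warm-up part corresponds to an SGD optimizer driven by a MultiStepLR-style schedule with milestones at 100, 150, and 175 and gamma 0.1.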