Understanding Emotional Body Expressions via Large Language Models
Authors: Haifeng Lu, Jiuyi Chen, Feng Liang, Mingkui Tan, Runhao Zeng, Xiping Hu
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results demonstrate that our model achieves recognition accuracy comparable to existing methods. More importantly, with the support of background knowledge from LLMs, our model can generate detailed emotion descriptions based on classification results, even when trained on a limited amount of labeled skeleton data. |
| Researcher Affiliation | Academia | 1Guangdong-Hong Kong-Macao Joint Laboratory for Emotional Intelligence and Pervasive Computing, Shenzhen MSU-BIT University, 2Lanzhou University, 3South China University of Technology, 4Peng Cheng Laboratory, 5Shenzhen University, 6Beijing Institute of Technology |
| Pseudocode | No | The paper describes the methodology using textual explanations and diagrams (Figure 1), but does not include any explicit pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain any explicit statements about releasing source code for the described methodology, nor does it provide any links to a code repository. |
| Open Datasets | Yes | Emilya The EMotional body expression In daILY Actions dataset (Fourati and Pelachaud 2016) comprises 8,206 samples. ... KDAE The Kinematic Dataset of Actors Expressing Emotions (Zhang et al. 2020) was collected using a portable motion capture system... EGBM The Emotional Gestures and Body Movements Corpus (Sapiński et al. 2019) contains 560 samples recorded by a Kinect V2 camera at 30 Hz. |
| Dataset Splits | Yes | We randomly split the dataset into training and testing sets at a 4:1 ratio. |
| Hardware Specification | Yes | All experiments are conducted on 4 NVIDIA A100 GPUs and are implemented using the PyTorch framework (Paszke et al. 2019). |
| Software Dependencies | No | All experiments are conducted on 4 NVIDIA A100 GPUs and are implemented using the PyTorch framework (Paszke et al. 2019). The paper mentions PyTorch but does not specify its version number or the versions of other key software components. |
| Experiment Setup | Yes | Training For emotion description, the model is trained for 10,000 steps using labeled datasets. The LoRA parameters are configured with a rank of 64, an alpha of 16, and a dropout rate of 0.05. The global batch size is 16, and the maximum learning rate is 1e-5. For emotion recognition, the model undergoes 800,000 training steps, with the global batch size increased to 64. The LoRA parameters and maximum learning rate remain the same as in the emotion description stage. During skeleton encoder pre-training, we optimize the model for 200 epochs using the stochastic gradient descent (SGD) optimizer with an initial learning rate of 0.1 and a batch size of 64. The learning rate is reduced by a factor of 10 at epochs 100, 150, and 175. A warm-up strategy is applied for the first 5 epochs. |
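The 4:1 train/test split quoted above can be sketched as follows. This is a minimal illustration, not the authors' code; the function name, seed, and use of a plain 80% cutoff are assumptions, since the paper only states that the dataset is split randomly at a 4:1 ratio.

```python
import random

def split_4_to_1(samples, seed=0):
    """Randomly split samples into train/test sets at a 4:1 ratio.

    Illustrative sketch of the split described in the paper; the seed
    and the floor-based 80% cutoff are assumptions.
    """
    rng = random.Random(seed)
    indices = list(range(len(samples)))
    rng.shuffle(indices)
    cut = int(len(samples) * 0.8)  # 4:1 ratio -> 80% train, 20% test
    train = [samples[i] for i in indices[:cut]]
    test = [samples[i] for i in indices[cut:]]
    return train, test

# e.g., the Emilya dataset has 8,206 samples
train, test = split_4_to_1(list(range(8206)))
```

With 8,206 samples this yields 6,564 training and 1,642 testing samples under the floor-based cutoff.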
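The skeleton-encoder pre-training schedule described above (initial LR 0.1, divided by 10 at epochs 100, 150, and 175, with a 5-epoch warm-up) can be sketched as a per-epoch function. The linear warm-up shape and the function name are assumptions; the paper only says a warm-up strategy is applied for the first 5 epochs.

```python
def pretrain_lr(epoch, base_lr=0.1, warmup_epochs=5, milestones=(100, 150, 175)):
    """Step-decay LR schedule for skeleton-encoder pre-training.

    Sketch of the schedule reported in the paper: warm-up for the first
    5 epochs (linear shape assumed), then the learning rate is divided
    by 10 at epochs 100, 150, and 175.
    """
    if epoch < warmup_epochs:
        # Linear ramp from base_lr/warmup_epochs up to base_lr (assumed shape).
        return base_lr * (epoch + 1) / warmup_epochs
    lr = base_lr
    for m in milestones:
        if epoch >= m:
            lr /= 10.0  # reduce by a factor of 10 at each milestone
    return lr
```

For example, the rate stays at 0.1 between warm-up and epoch 100, then steps down to 0.01, 0.001, and finally 0.0001 after epoch 175.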