OV-MER: Towards Open-Vocabulary Multimodal Emotion Recognition
Authors: Zheng Lian, Haiyang Sun, Licai Sun, Haoyu Chen, Lan Chen, Hao Gu, Zhuofan Wen, Shun Chen, Siyuan Zhang, Hailiang Yao, Bin Liu, Rui Liu, Shan Liang, Ya Li, Jiangyan Yi, Jianhua Tao
ICML 2025 | Venue PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Benchmark. We build zero-shot benchmarks for OV-MER through extensive experiments and detailed analysis. This task can serve as an important evaluation benchmark for multimodal LLMs (MLLMs), challenging their ability to integrate multimodal clues and capture subtle temporal variations in emotional expression. Experiments. Our intensive experimental results not only demonstrate the strength of our methods but also prove that OV-MER can effectively enhance the presentation ability of emotions and user experience. |
| Researcher Affiliation | Academia | 1Institute of Automation, Chinese Academy of Sciences 2Shanghai Jiao Tong University 3CMVS, University of Oulu 4Inner Mongolia University 5Xi'an Jiaotong-Liverpool University 6Beijing University of Posts and Telecommunications 7Department of Automation, Tsinghua University 8Beijing National Research Center for Information Science and Technology, Tsinghua University. |
| Pseudocode | No | The paper describes a model architecture with mathematical equations in Appendix U but does not present a structured pseudocode or algorithm block. For example: "h_i^m = ReLU(f_i^m W_h^m + b_h^m), m ∈ {a, l, v} (8); h_i = Concat(h_i^a, h_i^l, h_i^v) (9); α_i = softmax(h_i^T W_α + b_α) (10); z_i = h_i α_i (11)". This is a mathematical description, not pseudocode. |
| Open Source Code | Yes | Code and dataset are available at: https://github.com/zeroQiaoba/AffectGPT. |
| Open Datasets | Yes | Code and dataset are available at: https://github.com/zeroQiaoba/AffectGPT. Ultimately, we create a dataset, OV-MERD, which offers a richer set of emotions compared to existing datasets (see Table 1). This dataset is an extension of MER2023 (Lian et al., 2023). |
| Dataset Splits | No | The paper states that OV-MERD is an extension of MER2023 from which a portion of samples were randomly selected for further annotation. However, it does not specify the train/validation/test splits for the OV-MERD dataset used in their experiments. For example: "We randomly selected a subset of MER2023 for further annotation to construct our OV-MERD dataset." |
| Hardware Specification | Yes | All models are implemented in PyTorch, and all inference processes are executed on a 32GB NVIDIA Tesla V100 GPU. |
| Software Dependencies | No | The paper mentions "All models are implemented in PyTorch" but does not specify a version number for PyTorch or any other key software dependencies with version numbers. |
| Experiment Setup | No | The paper discusses various baselines and evaluation metrics, but it does not provide specific hyperparameter values (e.g., learning rate, batch size, number of epochs, optimizer settings) for training any of the models used in the experiments. It focuses on zero-shot evaluation and general model performance rather than detailed training configurations. |
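The fusion step quoted in the Pseudocode row (Eqs. 8–11: per-modality ReLU projections, concatenation, softmax attention, weighted sum) can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation; the hidden size, random weights, and the choice to apply attention across the three stacked modality projections are assumptions, since the excerpt gives only the equations.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax."""
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_fusion(f_a, f_l, f_v, hidden=4, seed=0):
    """Sketch of Eqs. (8)-(11): project each modality feature through a
    ReLU layer, stack the projections, score them with a learned vector,
    and return the attention-weighted sum as the fused representation z.
    Weights are random placeholders for illustration only."""
    rng = np.random.default_rng(seed)
    projections = []
    for f in (f_a, f_l, f_v):                        # m in {a, l, v}
        W_h = rng.standard_normal((f.shape[0], hidden))
        b_h = np.zeros(hidden)
        projections.append(np.maximum(f @ W_h + b_h, 0.0))  # Eq. (8)
    H = np.stack(projections)                        # Eq. (9), shape (3, hidden)
    W_alpha = rng.standard_normal((hidden, 1))
    b_alpha = 0.0
    alpha = softmax(H @ W_alpha + b_alpha)           # Eq. (10), one weight per modality
    z = (H * alpha).sum(axis=0)                      # Eq. (11), weighted sum
    return z, alpha.ravel()

# Toy features with different per-modality dimensions.
z, alpha = attention_fusion(np.ones(5), np.ones(6), np.ones(7))
```

Note that the attention weights sum to one across the three modalities, so `z` is a convex combination of the (non-negative) projected modality vectors.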