MATCH: Modality-Calibrated Hypergraph Fusion Network for Conversational Emotion Recognition

Authors: Jiandong Shi, Ming Li, Lu Bai, Feilong Cao, Ke Lu, Jiye Liang

IJCAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments demonstrate that MATCH outperforms state-of-the-art approaches on two benchmark datasets. ... 4 Experiments 4.1 Datasets 4.2 Baselines 4.3 Implementation Details 4.4 Results and Discussion 4.5 Ablation Study
Researcher Affiliation | Academia | 1 School of Computer Science and Technology, Zhejiang Normal University; 2 Zhejiang Key Laboratory of Intelligent Education Technology and Application, Zhejiang Normal University; 3 Zhejiang Institute of Optoelectronics; 4 School of Artificial Intelligence, Beijing Normal University; 5 School of Mathematical Sciences, Zhejiang Normal University; 6 School of Engineering Science, University of Chinese Academy of Sciences; 7 Peng Cheng Laboratory; 8 Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education, the School of Computer and Information Technology, Shanxi University
Pseudocode | No | The paper describes its methods using prose and mathematical equations in Section 3 Methodology, but does not include any explicit pseudocode or algorithm blocks.
Open Source Code | No | The paper does not contain an explicit statement about the release of its own source code or a link to a code repository. The note 'denotes source code available' in Table 3 refers to baseline models, not the proposed MATCH framework.
Open Datasets | Yes | We evaluate the MATCH on two benchmark datasets, IEMOCAP and MELD. ... IEMOCAP [Busso et al., 2008] ... MELD [Poria et al., 2019]
Dataset Splits | Yes | IEMOCAP [Busso et al., 2008] contains video data from dyadic conversations with ten speakers, with utterances classified into six emotion categories: Happy, Sad, Neutral, Angry, Excited, Frustrated. Following [Hu et al., 2021], we use the first four sessions for training, the last for testing, and randomly select 10% of the training set for validation. MELD [Poria et al., 2019] consists of video data from multi-party conversations in the TV show Friends, with utterances classified into seven emotion categories, i.e., Neutral, Surprise, Fear, Sadness, Joy, Disgust, Anger. We use the predefined splits for training and evaluation. ... Table 1: Statistics of the two benchmark datasets.
Hardware Specification | Yes | All experiments were conducted on an NVIDIA RTX A6000 GPU using the torch-geometric package.
Software Dependencies | No | All experiments were conducted on an NVIDIA RTX A6000 GPU using the torch-geometric package. The paper does not provide version numbers for `torch-geometric` or any other software dependencies.
Experiment Setup | Yes | We used a batch size of 16 for both datasets. For IEMOCAP, the learning rate was set to 1e-4 with a dropout rate of 0.4, while for MELD, the learning rate was set to 5e-4 with a dropout rate of 0.3. Additional parameter settings are provided in Table 2. ... Table 2: Hyperparameter settings on two datasets.
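The reported setup and the IEMOCAP split procedure can be sketched in code. This is a minimal illustration, not the authors' implementation: the hyperparameter values are taken from the quoted Experiment Setup row, while the `iemocap_split` interface (a list of `(session_id, dialogue)` pairs) and the fixed random seed are hypothetical assumptions for demonstration only.

```python
import random

# Per-dataset hyperparameters as reported in the paper's Experiment Setup.
CONFIGS = {
    "IEMOCAP": {"batch_size": 16, "lr": 1e-4, "dropout": 0.4},
    "MELD": {"batch_size": 16, "lr": 5e-4, "dropout": 0.3},
}


def iemocap_split(dialogues, seed=0):
    """Split IEMOCAP dialogues by session, following [Hu et al., 2021]:
    sessions 1-4 for training, session 5 for testing, and a random 10%
    of the training dialogues held out for validation.

    `dialogues` is assumed to be a list of (session_id, dialogue) pairs;
    this interface is hypothetical, not taken from the paper.
    """
    train = [d for s, d in dialogues if s in (1, 2, 3, 4)]
    test = [d for s, d in dialogues if s == 5]
    rng = random.Random(seed)
    rng.shuffle(train)
    n_val = max(1, len(train) // 10)  # hold out 10% of training for validation
    return train[n_val:], train[:n_val], test
```

MELD needs no such helper because, as quoted above, the paper uses that dataset's predefined splits.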