A Multi-Focus-Driven Multi-Branch Network for Robust Multimodal Sentiment Analysis

Authors: Chuanqi Tao, Jiaming Li, Tianzi Zang, Peng Gao

AAAI 2025

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "Experiments performed on two benchmarks show that our approach obtains results superior to state-of-the-art models. We conduct extensive experiments on the CMU-MOSI and CMU-MOSEI datasets under both complete and missing modalities conditions, and we gain superior results to the state-of-the-art models." |
| Researcher Affiliation | Academia | College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China |
| Pseudocode | No | The paper describes the methodology using mathematical equations and textual descriptions of the model architecture and processes, but it does not include a clearly labeled 'Pseudocode' or 'Algorithm' block, nor a structured procedure formatted like code. |
| Open Source Code | Yes | https://github.com/MFMB-Net/MFMB-Net |
| Open Datasets | Yes | "We conduct experiments on public multimodal sentiment analysis benchmark datasets. The basic statistics of each dataset are shown in Table 1. CMU-MOSI (Zadeh et al. 2016) contains 2199 short monologue video clips... CMU-MOSEI (Zadeh et al. 2018b) contains 22856 annotated video segments..." |
| Dataset Splits | Yes | MOSI — Train 552/53/679, Val 92/13/124, Test 379/30/277; MOSEI — Train 4738/3540/8048, Val 506/433/932, Test 1350/1025/2284 |
| Hardware Specification | Yes | "The model is trained with a single NVIDIA GeForce RTX 3090 GPU." |
| Software Dependencies | No | The paper mentions using the PyTorch framework and tools such as bert-base-uncased, COVAREP, and Facet, but does not provide specific version numbers for any of these software components. |
| Experiment Setup | Yes | "For model training, we employ an Adam optimizer and adopt an early-stopping strategy with a patience of 6 epochs. In both datasets, the learning rate is set to 0.002 and the batch size is 24." |
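The early-stopping schedule in the reported setup can be sketched as follows. This is a minimal, framework-free illustration, not the authors' code: only the learning rate (0.002), batch size (24), and patience (6 epochs) come from the paper; the function name and the synthetic validation-loss curve are placeholders for illustration.

```python
# Hyperparameters reported in the paper (both MOSI and MOSEI).
LR = 0.002        # Adam learning rate
BATCH_SIZE = 24   # batch size
PATIENCE = 6      # early-stopping patience in epochs

def train_with_early_stopping(val_losses, patience=PATIENCE):
    """Return the 0-based epoch at which training stops.

    Training stops once the validation loss has failed to improve for
    `patience` consecutive epochs; otherwise it runs to the last epoch.
    """
    best = float("inf")
    bad_epochs = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best = loss        # new best validation loss: reset the counter
            bad_epochs = 0
        else:
            bad_epochs += 1    # no improvement this epoch
            if bad_epochs >= patience:
                return epoch   # stop: `patience` epochs without improvement
    return len(val_losses) - 1

# Synthetic curve: improves for four epochs, then plateaus, so training
# stops 6 epochs after the best value at epoch 3.
losses = [1.0, 0.8, 0.7, 0.65, 0.66, 0.66, 0.67, 0.68, 0.69, 0.70, 0.71]
stop_epoch = train_with_early_stopping(losses)  # → 9
```

In a real run the per-epoch losses would come from evaluating the model on the validation split after each training epoch, with the optimizer configured with `LR` and mini-batches of size `BATCH_SIZE`.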