Multi-to-Single: Reducing Multimodal Dependency in Emotion Recognition Through Contrastive Learning
Authors: Yan-Kai Liu, Jinyu Cai, Bao-Liang Lu, Wei-Long Zheng
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on five public multimodal emotion datasets demonstrate that our model achieves the state-of-the-art performance in the cross-modal tasks and maintains multimodal performance using only a single modality. |
| Researcher Affiliation | Academia | Yan-Kai Liu, Jinyu Cai, Bao-Liang Lu, Wei-Long Zheng* Shanghai Jiao Tong University EMAIL |
| Pseudocode | Yes | Algorithm 1: Pre-training Phase of M2S Model. Input: unlabeled, paired data X^A, X^B from modalities A and B. Hyperparameters: learning rate and weight decay. Output: pre-trained encoders. 1: Feed X^A and X^B into their corresponding encoders: z_R^A = E_R^A(X^A), z_I^A = E_I^A(X^A); z_R^B = E_R^B(X^B), z_I^B = E_I^B(X^B). 2: Compute L_CLUB^A(z_R^A, z_I^A) and L_CLUB^B(z_R^B, z_I^B). 3: Compute L_Recon^A and L_Recon^B. 4: Feed z_R^A and z_R^B into the M2M CPC module. 5: Compute L_CPC^{A2B}, L_CPC^{B2A}, L_CPC^{A2A}, and L_CPC^{B2B}. 6: Flatten z_R^A and z_R^B, and project them into a new space. 7: Compute L_Contra. 8: Optimize the final loss: L = α·L_CLUB + β·L_Recon + γ·L_Contra + λ·L_CPC. 9: Return encoders E_R^A and E_R^B. |
| Open Source Code | Yes | Code: https://github.com/Arcee-LYK/Multi-to-Single. |
| Open Datasets | Yes | In the experiment, we use five public multimodal emotion datasets: SEED (Duan, Zhu, and Lu 2013; Zheng and Lu 2015), SEED-IV (Zheng et al. 2018), SEED-V (Liu et al. 2021), DEAP (Koelstra et al. 2011), and DREAMER (Katsigiannis and Ramzan 2017). |
| Dataset Splits | Yes | For the division of the training and testing sets: because each video clip in the SEED-series datasets has a fixed emotional label, we divide the SEED, SEED-IV, and SEED-V datasets in ratios of 9:6, 16:8, and 10:5, respectively. The labels in the DEAP and DREAMER datasets are subjects' scores on evaluation metrics, including valence, arousal, and dominance. This labeling leads to an uneven distribution of data, so we conduct four-fold and three-fold cross-validation on DEAP and DREAMER, respectively. Each fold's training-to-testing ratio is 3:1 and 2:1, respectively. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers like Python 3.8, PyTorch 1.9, CUDA 11.1) needed to replicate the experiment. |
| Experiment Setup | No | The paper mentions a "pre-training phase" and "fine-tuning stage" but does not specify concrete hyperparameter values (e.g., learning rate, batch size, number of epochs) or detailed training configurations within the main text. |
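Steps 6–8 of the quoted Algorithm 1 flatten and project the modality representations, compute a contrastive loss, and optimize a weighted sum of the four loss terms. Since the paper's exact formulation is not reproduced in the report, the sketch below is an illustrative assumption only: a symmetric InfoNCE-style contrastive term standing in for L_Contra, plus the weighted combination from step 8. All function names, shapes, weights, and the choice of InfoNCE are hypothetical, not the authors' implementation.

```python
import numpy as np

def info_nce(z_a, z_b, temperature=0.1):
    """Symmetric InfoNCE-style contrastive loss between paired modality
    embeddings (a simplified stand-in for the paper's L_Contra)."""
    # L2-normalize rows so dot products are cosine similarities
    z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
    z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)
    logits = z_a @ z_b.T / temperature  # (N, N); diagonal = positive pairs
    # Row-wise log-softmax; the loss pulls each diagonal entry up
    log_p_ab = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    log_p_ba = logits.T - np.log(np.exp(logits.T).sum(axis=1, keepdims=True))
    # Average the A->B and B->A directions (symmetric contrast)
    return (-np.diag(log_p_ab).mean() - np.diag(log_p_ba).mean()) / 2

def total_loss(l_club, l_recon, l_contra, l_cpc,
               alpha=1.0, beta=1.0, gamma=1.0, lam=1.0):
    """Step 8 of Algorithm 1: L = α·L_CLUB + β·L_Recon + γ·L_Contra + λ·L_CPC.
    The weight values here are placeholders, not the paper's settings."""
    return alpha * l_club + beta * l_recon + gamma * l_contra + lam * l_cpc

# Toy paired batch: modality-B projections are noisy copies of modality-A's
rng = np.random.default_rng(0)
z_a = rng.normal(size=(8, 16))
z_b = z_a + 0.05 * rng.normal(size=(8, 16))
print(info_nce(z_a, z_b))
```

As a sanity check, the loss for correctly paired batches should be lower than for mispaired ones, which is the property the contrastive pre-training relies on.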