Cross-Modal Alignment via Variational Copula Modelling

Authors: Feng Wu, Tsai Hor Chan, Fuying Wang, Guosheng Yin, Lequan Yu

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on public MIMIC datasets demonstrate the superior performance of our model over other competitors. [...] Empirical results on real multimodal MIMIC datasets demonstrate the good performance of our method and ablation analysis corroborates the effectiveness of copula in modality alignments and robustness to potential variations. [...] We evaluate the performance of CM2 using large-scale, real-world EHR datasets: MIMIC-III (Johnson et al., 2016), MIMIC-IV (Johnson et al., 2023), and MIMIC-CXR (Johnson et al., 2019).
Researcher Affiliation | Academia | School of Computing and Data Science, University of Hong Kong, Hong Kong, China. Correspondence to: Lequan Yu <EMAIL>.
Pseudocode | Yes | Algorithm 1: Sampling algorithm of our proposed method.
Open Source Code | Yes | The code is available at https://github.com/HKU-MedAI/CMCM.
Open Datasets | Yes | Extensive experiments on public MIMIC datasets demonstrate the superior performance of our model over other competitors. [...] We evaluate the performance of CM2 using large-scale, real-world EHR datasets: MIMIC-III (Johnson et al., 2016), MIMIC-IV (Johnson et al., 2023), and MIMIC-CXR (Johnson et al., 2019).
Dataset Splits | Yes | We split the dataset into training, validation, and test sets in the ratio of 70 : 15 : 15, following the procedure in Harutyunyan et al. (2019). [...] The dataset is split into training, validation, and test sets in the ratio of 70 : 10 : 20, following Hayat et al. (2022). [...] We split the data into 4,287 training samples, 465 validation samples, and 1,179 test samples.
Hardware Specification | Yes | All experiments are conducted on a single RTX-3090 GPU.
Software Dependencies | Yes | CM2 is implemented in Python 3.11 using PyTorch 1.9.
Experiment Setup | Yes | The batch size is set to 32 for models trained on the MIMIC-IV & CXR datasets, and 16 for models trained on the MIMIC-III & NOTE datasets, except for DrFuse, which is trained with a batch size of 8. [...] The hyperparameter search space includes: Dropout ratio: {0, 0.1, 0.2, 0.3}; Learning rate: {1×10⁻⁴, 5×10⁻⁵, 1×10⁻⁵}; Number of Gaussian mixtures K: {1, 2, 3, 4, 5, 6}; Temperature: {0.001, 0.005, 0.01, 0.05, 0.08}; Regularization parameter λcop: {1×10⁻⁵, 5×10⁻⁶, 1×10⁻⁶}
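The hyperparameter search space quoted in the Experiment Setup row can be enumerated exhaustively. A minimal sketch (the training/evaluation loop itself is not described in the quote and is assumed):

```python
from itertools import product

# Search space exactly as reported for CM2; key names are our own labels.
search_space = {
    "dropout": [0, 0.1, 0.2, 0.3],
    "lr": [1e-4, 5e-5, 1e-5],
    "num_mixtures_K": [1, 2, 3, 4, 5, 6],
    "temperature": [0.001, 0.005, 0.01, 0.05, 0.08],
    "lambda_cop": [1e-5, 5e-6, 1e-6],
}

def grid(space):
    """Yield one config dict per point of the Cartesian product."""
    keys = list(space)
    for values in product(*(space[k] for k in keys)):
        yield dict(zip(keys, values))

configs = list(grid(search_space))
# 4 * 3 * 6 * 5 * 3 = 1080 candidate configurations
```

Each `config` dict would then be passed to a training run, with the best configuration selected on the validation split.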
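The 70 : 15 : 15 split reported in the Dataset Splits row could be reproduced along these lines; the helper below is hypothetical (the paper itself follows the procedure of Harutyunyan et al. (2019)):

```python
import random

def split_indices(n, ratios=(0.70, 0.15, 0.15), seed=0):
    """Shuffle sample indices and cut them into train/val/test by ratio.
    Illustrative only -- not the exact procedure used in the paper."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = round(ratios[0] * n)
    n_val = round(ratios[1] * n)
    return (idx[:n_train],
            idx[n_train:n_train + n_val],
            idx[n_train + n_val:])

train, val, test = split_indices(1000)
```

Fixing the seed keeps the split deterministic across runs, which is what makes the reported train/val/test counts reproducible.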
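Algorithm 1 (the paper's sampling procedure) is specific to the proposed variational copula model and is not reproduced in the quotes above. As background, the generic bivariate Gaussian-copula sampler that such methods typically build on can be sketched as:

```python
import math
import random

def _phi(z):
    """Standard normal CDF."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def sample_gaussian_copula(corr, n, seed=0):
    """Draw n pairs of dependent Uniform(0,1) variates from a bivariate
    Gaussian copula with correlation `corr`.
    Illustrative only; the paper's variational model is more involved."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        z1 = rng.gauss(0, 1)
        # Correlated second latent Gaussian via Cholesky-style construction.
        z2 = corr * z1 + math.sqrt(1 - corr ** 2) * rng.gauss(0, 1)
        # Mapping through the normal CDF yields uniform marginals
        # while preserving the dependence structure.
        samples.append((_phi(z1), _phi(z2)))
    return samples

pairs = sample_gaussian_copula(0.8, 5)
```

The key property, which the ablation on copula-based modality alignment exploits, is that the marginals stay uniform while the copula alone carries the cross-variable (here, cross-modal) dependence.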