Cross-Modal Alignment via Variational Copula Modelling

Authors: Feng Wu, Tsai Hor Chan, Fuying Wang, Guosheng Yin, Lequan Yu

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on public MIMIC datasets demonstrate the superior performance of our model over other competitors. [...] Empirical results on real multimodal MIMIC datasets demonstrate the good performance of our method and ablation analysis corroborates the effectiveness of copula in modality alignments and robustness to potential variations. [...] We evaluate the performance of CM2 using large-scale, real-world EHR datasets: MIMIC-III (Johnson et al., 2016), MIMIC-IV (Johnson et al., 2023), and MIMIC-CXR (Johnson et al., 2019).
Researcher Affiliation | Academia | School of Computing and Data Science, University of Hong Kong, Hong Kong, China. Correspondence to: Lequan Yu <EMAIL>.
Pseudocode | Yes | Algorithm 1: Sampling algorithm of our proposed method.
Open Source Code | Yes | The code is available at https://github.com/HKU-MedAI/CMCM.
Open Datasets | Yes | Extensive experiments on public MIMIC datasets demonstrate the superior performance of our model over other competitors. [...] We evaluate the performance of CM2 using large-scale, real-world EHR datasets: MIMIC-III (Johnson et al., 2016), MIMIC-IV (Johnson et al., 2023), and MIMIC-CXR (Johnson et al., 2019).
Dataset Splits | Yes | We split the dataset into training, validation, and test sets in the ratio of 70 : 15 : 15, following the procedure in Harutyunyan et al. (2019). [...] The dataset is split into training, validation, and test sets in the ratio of 70 : 10 : 20, following Hayat et al. (2022). [...] We split the data into 4,287 training samples, 465 validation samples, and 1,179 test samples.
Hardware Specification | Yes | All experiments are conducted on a single RTX-3090 GPU.
Software Dependencies | Yes | CM2 is implemented in Python 3.11 using PyTorch 1.9.
Experiment Setup | Yes | The batch size is set to 32 for models trained on the MIMIC-IV & CXR datasets, and 16 for models trained on the MIMIC-III & NOTE datasets, except for DrFuse, which is trained with a batch size of 8. [...] The hyperparameter search space includes: Dropout ratio: {0, 0.1, 0.2, 0.3}; Learning rate: {1×10⁻⁴, 5×10⁻⁵, 1×10⁻⁵}; Number of Gaussian mixtures K: {1, 2, 3, 4, 5, 6}; Temperature: {0.001, 0.005, 0.01, 0.05, 0.08}; Regularization parameter λcop: {1×10⁻⁵, 5×10⁻⁶, 1×10⁻⁶}
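The hyperparameter search space quoted in the Experiment Setup row can be enumerated exhaustively. A minimal sketch (the training/evaluation loop itself is not described in the quote and is assumed):

```python
from itertools import product

# Search space exactly as reported for CM2; key names are our own labels.
search_space = {
    "dropout": [0, 0.1, 0.2, 0.3],
    "lr": [1e-4, 5e-5, 1e-5],
    "num_mixtures_K": [1, 2, 3, 4, 5, 6],
    "temperature": [0.001, 0.005, 0.01, 0.05, 0.08],
    "lambda_cop": [1e-5, 5e-6, 1e-6],
}

def grid(space):
    """Yield one config dict per point of the Cartesian product."""
    keys = list(space)
    for values in product(*(space[k] for k in keys)):
        yield dict(zip(keys, values))

configs = list(grid(search_space))
# 4 * 3 * 6 * 5 * 3 = 1080 candidate configurations
```

Each `config` dict would then be passed to a training run, with the best configuration selected on the validation split.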
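The 70 : 15 : 15 split reported in the Dataset Splits row could be reproduced along these lines; the helper below is hypothetical (the paper itself follows the procedure of Harutyunyan et al. (2019)):

```python
import random

def split_indices(n, ratios=(0.70, 0.15, 0.15), seed=0):
    """Shuffle sample indices and cut them into train/val/test by ratio.
    Illustrative only -- not the exact procedure used in the paper."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = round(ratios[0] * n)
    n_val = round(ratios[1] * n)
    return (idx[:n_train],
            idx[n_train:n_train + n_val],
            idx[n_train + n_val:])

train, val, test = split_indices(1000)
```

Fixing the seed keeps the split deterministic across runs, which is what makes the reported train/val/test counts reproducible.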
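Algorithm 1 (the paper's sampling procedure) is specific to the proposed variational copula model and is not reproduced in the quotes above. As background, the generic bivariate Gaussian-copula sampler that such methods typically build on can be sketched as:

```python
import math
import random

def _phi(z):
    """Standard normal CDF."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def sample_gaussian_copula(corr, n, seed=0):
    """Draw n pairs of dependent Uniform(0,1) variates from a bivariate
    Gaussian copula with correlation `corr`.
    Illustrative only; the paper's variational model is more involved."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        z1 = rng.gauss(0, 1)
        # Correlated second latent Gaussian via Cholesky-style construction.
        z2 = corr * z1 + math.sqrt(1 - corr ** 2) * rng.gauss(0, 1)
        # Mapping through the normal CDF yields uniform marginals
        # while preserving the dependence structure.
        samples.append((_phi(z1), _phi(z2)))
    return samples

pairs = sample_gaussian_copula(0.8, 5)
```

The key property, which the ablation on copula-based modality alignment exploits, is that the marginals stay uniform while the copula alone carries the cross-variable (here, cross-modal) dependence.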