Modalities Contribute Unequally: Enhancing Medical Multi-modal Learning through Adaptive Modality Token Re-balancing

Authors: Jie Peng, Jenna L. Ballard, Mohan Zhang, Sukwon Yun, Jiayi Xin, Qi Long, Yanyong Zhang, Tianlong Chen

ICML 2025

Reproducibility Assessment (each item lists the variable, the result, and the supporting LLM response):
Research Type: Experimental. Comprehensive experiments on both medical and general multi-modal datasets demonstrate the effectiveness and generalizability of AMC. We demonstrate the effectiveness of AMC through extensive experiments on several real-world datasets, including the MIMIC-IV dataset, the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset, and a subset of the TCGA benchmark covering five different cancer types.
Researcher Affiliation: Academia. (1) University of Science and Technology of China, (2) University of Pennsylvania, (3) University of North Carolina at Chapel Hill. Correspondence to: Yanyong Zhang <EMAIL>.
Pseudocode: No. The paper describes the operations and steps of AMC within the main text (e.g., Section 4.2 Modality Importance Calculation, Section 4.3 Customized Token Fusion) and through figures (e.g., Figures 2 and 3), but it does not include a distinct, labeled pseudocode or algorithm block.
Open Source Code: Yes. Code is available at https://github.com/PengJieb/amc.
Open Datasets: Yes. We demonstrate the effectiveness of AMC through extensive experiments on several real-world datasets, including the MIMIC-IV dataset, the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset, and a subset of the TCGA benchmark covering five different cancer types. We select the Enhanced Rico (ENRICO) dataset (Leiva et al., 2021) to evaluate the generalizability of AMC.
Dataset Splits: Yes. For the dataset split, we use 70% for training, 15% for validation, and 15% for testing.
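The quoted 70/15/15 split can be sketched as follows. This is a minimal illustration only: the shuffling, seeding, and any stratification are assumptions, since the paper's text does not specify how the split is drawn.

```python
import random

def split_dataset(indices, seed=0):
    """Shuffle sample indices and split 70% / 15% / 15% into
    train / validation / test, matching the ratios quoted above.
    The shuffle and fixed seed are illustrative assumptions."""
    rng = random.Random(seed)
    idx = list(indices)
    rng.shuffle(idx)
    n = len(idx)
    n_train = int(0.70 * n)
    n_val = int(0.15 * n)
    train = idx[:n_train]
    val = idx[n_train:n_train + n_val]
    test = idx[n_train + n_val:]
    return train, val, test

# Example: 1000 samples -> 700 / 150 / 150 disjoint subsets.
train, val, test = split_dataset(range(1000))
```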
Hardware Specification: Yes. All experiments were conducted using RTX 3090 GPUs.
Software Dependencies: No. The paper describes the implementation and experimental setup but does not provide specific version numbers for software dependencies such as Python, PyTorch, or other libraries.
Experiment Setup: Yes. To ensure a fair comparison with baselines, we use the best hyper-parameter settings from the original papers. If these are not available, we conduct hyper-parameter searches over learning rate, hidden dimension, and batch size, with ranges of [1e-3, 1e-4, 5e-5, 1e-5], [32, 64, 128], and [32, 64, 128], respectively. For our proposed method, we additionally search the number of experts and the weights of L_I, L_T, and the load-balancing loss of SMoE, with ranges of [4, 8, 16], [1.0, 0.1], [1.0, 0.1], and [1.0, 0.1], respectively. The final hyper-parameter settings for AMC are in Appendix B.1 (Table 7):

Table 7. The hyper-parameter setup for AMC. UCEC, LUAD, LGG, BRCA, and BLCA are the five TCGA cancer types.

Hyper-parameter           ADNI   MIMIC-IV  UCEC   LUAD   LGG    BRCA   BLCA   ENRICO
Learning rate             1e-4   1e-3      1e-3   1e-3   1e-3   1e-3   1e-3   5e-3
# of Experts              8      8         8      8      8      8      8      8
Top-K                     2      2         2      2      2      2      2      2
# of Transformer Layers   2      2         2      2      2      2      2      4
Training Epochs           30     100       30     30     30     30     30     100
Warm-up Epochs            5      10        5      5      5      5      5      5
Hidden dimension          64     64        128    64     64     64     64     128
Batch Size                32     64        64     64     64     64     64     128
# of Attention Heads      8      8         8      8      8      8      8      8
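The quoted search ranges can be enumerated as a simple grid. The sketch below only builds the candidate configurations; the training and evaluation loop, and every identifier name, are hypothetical illustrations, not taken from the released code.

```python
from itertools import product

# Search ranges quoted in the setup above.
LEARNING_RATES = [1e-3, 1e-4, 5e-5, 1e-5]
HIDDEN_DIMS = [32, 64, 128]
BATCH_SIZES = [32, 64, 128]
NUM_EXPERTS = [4, 8, 16]
# The weights of L_I, L_T, and the SMoE load-balancing loss each
# range over [1.0, 0.1] and are searched independently.
LOSS_WEIGHTS = [1.0, 0.1]

def candidate_configs():
    """Yield one dict per point on the hyper-parameter grid."""
    for lr, hd, bs, ne, w_li, w_lt, w_lb in product(
        LEARNING_RATES, HIDDEN_DIMS, BATCH_SIZES, NUM_EXPERTS,
        LOSS_WEIGHTS, LOSS_WEIGHTS, LOSS_WEIGHTS,
    ):
        yield {
            "lr": lr,
            "hidden_dim": hd,
            "batch_size": bs,
            "num_experts": ne,
            "w_li": w_li,
            "w_lt": w_lt,
            "w_load_balance": w_lb,
        }

configs = list(candidate_configs())
# 4 * 3 * 3 * 3 * 2 * 2 * 2 = 864 candidate configurations;
# each would be trained and scored on the validation split.
```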