An Information Criterion for Controlled Disentanglement of Multimodal Data

Authors: Chenyu Wang, Sharut Gupta, Xinyi Zhang, Sana Tonekaboni, Stefanie Jegelka, Tommi Jaakkola, Caroline Uhler

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirically, we demonstrate that DISENTANGLEDSSL successfully achieves both distinct coverage and disentanglement for representations on a suite of synthetic datasets and multiple real-world multimodal datasets. It consistently outperforms baselines on prediction tasks for vision-language data, as well as molecule-phenotype retrieval tasks for biological data. We conduct a simulation study and two real-world multimodal experiments to evaluate the efficacy of our proposed DISENTANGLEDSSL.
Researcher Affiliation | Academia | Chenyu Wang 1,2, Sharut Gupta 1, Xinyi Zhang 1,2, Sana Tonekaboni 2, Stefanie Jegelka 1,3, Tommi Jaakkola 1, Caroline Uhler 1,2 (1 MIT, 2 Broad Institute of MIT and Harvard, 3 TU Munich)
Pseudocode | Yes | We introduce a two-step training procedure. The first step optimizes the shared latent representation, ensuring it captures as close to the minimal necessary information as possible. The second step then uses the shared representations learned in step 1 to facilitate learning the modality-specific representations. This sequential approach is formalized in the optimization objectives given in Equations 3 and 4, with pseudocode provided in Appendix I.
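The two-step procedure quoted above can be sketched as follows. This is a minimal illustration, not the paper's implementation: an InfoNCE loss stands in for the shared-representation objective of Equation 3, and a cross-covariance penalty (weight `lam`) stands in for the disentanglement term of Equation 4; all dimensions, architectures, and hyperparameters are placeholders.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def mlp(d_in, d_out, hidden=64):
    return nn.Sequential(nn.Linear(d_in, hidden), nn.ReLU(), nn.Linear(hidden, d_out))

def info_nce(za, zb, tau=0.1):
    # Contrastive alignment of paired representations across modalities.
    za, zb = F.normalize(za, dim=-1), F.normalize(zb, dim=-1)
    logits = za @ zb.T / tau
    return F.cross_entropy(logits, torch.arange(len(za)))

def decorrelate(a, b):
    # Squared cross-covariance: a cheap proxy for independence of two codes.
    a, b = a - a.mean(0), b - b.mean(0)
    return (a.T @ b / len(a)).pow(2).mean()

x1, x2 = torch.randn(32, 16), torch.randn(32, 16)  # toy paired modalities

# Step 1: learn shared encoders f1, f2.
f1, f2 = mlp(16, 8), mlp(16, 8)
opt1 = torch.optim.Adam([*f1.parameters(), *f2.parameters()], lr=1e-3)
for _ in range(10):
    loss1 = info_nce(f1(x1), f2(x2))
    opt1.zero_grad(); loss1.backward(); opt1.step()

# Step 2: freeze the shared encoders, then learn modality-specific
# encoders g1, g2 (with decoders d1, d2 for a reconstruction objective).
for p in [*f1.parameters(), *f2.parameters()]:
    p.requires_grad_(False)
g1, g2, d1, d2 = mlp(16, 8), mlp(16, 8), mlp(16, 16), mlp(16, 16)
opt2 = torch.optim.Adam(
    [*g1.parameters(), *g2.parameters(), *d1.parameters(), *d2.parameters()], lr=1e-3)
lam = 1e-3
for _ in range(10):
    s1, s2 = f1(x1), f2(x2)  # frozen shared representations
    m1, m2 = g1(x1), g2(x2)  # trainable modality-specific representations
    rec = F.mse_loss(d1(torch.cat([s1, m1], -1)), x1) \
        + F.mse_loss(d2(torch.cat([s2, m2], -1)), x2)
    loss2 = rec + lam * (decorrelate(s1, m1) + decorrelate(s2, m2))
    opt2.zero_grad(); loss2.backward(); opt2.step()
```

The key structural point is the sequencing: step 2 optimizes only `g1`, `g2` (and the decoders), treating the step-1 shared representations as fixed targets to be complemented rather than re-learned.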
Open Source Code | Yes | The code is available at https://github.com/uhlerlab/DisentangledSSL.
Open Datasets | Yes | Empirically, we demonstrate that DISENTANGLEDSSL successfully achieves both distinct coverage and disentanglement for representations on a suite of synthetic datasets and multiple real-world multimodal datasets... We utilize the real-world multimodal benchmark from MultiBench (Liang et al., 2021)... We use two high-content drug screening datasets which provide phenotypic profiles after drug perturbation: RxRx19a (Cuccarese et al., 2020) containing cell imaging profiles, and LINCS (Subramanian et al., 2017) containing L1000 gene expression profiles.
Dataset Splits | Yes | We follow the same setting (dataset splitting, encoder architecture, pre-extracted features) as in Liang et al. (2024). We conduct train-validation-test splitting according to molecules.
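Splitting "according to molecules" means every measurement of a given molecule must land in exactly one of train/validation/test, so the test set contains only unseen molecules. A minimal group-wise split can be sketched as below; this is an illustrative stand-in, not the authors' script, and the function and field names are hypothetical.

```python
import random
from collections import defaultdict

def split_by_group(samples, group_key, fracs=(0.8, 0.1, 0.1), seed=0):
    """Split samples so all entries sharing a group (e.g. a molecule)
    land in the same split."""
    groups = sorted({group_key(s) for s in samples})
    rng = random.Random(seed)
    rng.shuffle(groups)
    n_train = int(fracs[0] * len(groups))
    n_val = int(fracs[1] * len(groups))
    assign = {}
    for i, g in enumerate(groups):
        assign[g] = "train" if i < n_train else "val" if i < n_train + n_val else "test"
    out = defaultdict(list)
    for s in samples:
        out[assign[group_key(s)]].append(s)
    return out["train"], out["val"], out["test"]

# Toy usage: 10 molecules, 2 phenotype measurements each.
samples = [(f"mol{i // 2}", i) for i in range(20)]
train, val, test = split_by_group(samples, group_key=lambda s: s[0])
```

A random per-sample split would leak molecules across splits and inflate retrieval scores, which is why the grouping matters here.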
Hardware Specification | Yes | Each experiment was conducted on one NVIDIA RTX A5000 GPU with 24GB of accelerator RAM.
Software Dependencies | No | All experiments were implemented using the PyTorch deep learning framework. We utilize Mol2vec (Jaeger et al., 2018) to featurize the molecular structures into 300-dimensional feature vectors. The paper does not provide specific version numbers for PyTorch or Mol2vec.
Experiment Setup | Yes | We assess the performance of the learned shared and modality-specific representations for different values of β and λ, as shown in Figure 4. For DISENTANGLEDSSL, we sweep β ∈ {0.0, 0.001, 0.01, 0.1, 0.5, 1.0, 5.0, 10.0, 50.0, 100.0, 300.0, 500.0, 1000.0} and λ ∈ {0.0, 0.001, 0.01, 0.1, 1.0, 10.0, 100.0}. For the final models, we use β = 1.0 and λ = 10^-3 for all datasets, except for MOSI where β = 0.01. For both molecular structures and phenotypes, we employ 3-layer MLP encoders with a hidden dimension of 2560.
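The reported setup can be sketched as follows. The 3-layer MLP with hidden width 2560 and the β/λ grids come from the quoted text; the input dimension matches the 300-dim Mol2vec features mentioned earlier, while the 128-dim output is an assumption for illustration.

```python
import itertools
import torch
import torch.nn as nn

def encoder(d_in, d_out, hidden=2560):
    """3-layer MLP encoder (three Linear layers, hidden width 2560)."""
    return nn.Sequential(
        nn.Linear(d_in, hidden), nn.ReLU(),
        nn.Linear(hidden, hidden), nn.ReLU(),
        nn.Linear(hidden, d_out),
    )

# Hyperparameter grid from the reported sweep.
betas = [0.0, 0.001, 0.01, 0.1, 0.5, 1.0, 5.0, 10.0,
         50.0, 100.0, 300.0, 500.0, 1000.0]
lambdas = [0.0, 0.001, 0.01, 0.1, 1.0, 10.0, 100.0]
grid = list(itertools.product(betas, lambdas))

# e.g. a molecule encoder over 300-dim Mol2vec features
# (the 128-dim latent size is a placeholder, not from the paper).
enc = encoder(300, 128)
z = enc(torch.randn(4, 300))
```

Sweeping the full grid means training one model per (β, λ) pair, which is what Figure 4's performance curves across β and λ values correspond to.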