reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Modeling Complex System Dynamics with Flow Matching Across Time and Conditions

Authors: Martin Rohbeck, Edward De Brouwer, Charlotte Bunne, Jan-Christian Huetter, Anne Biton, Kelvin Chen, Aviv Regev, Romain Lopez

ICLR 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We demonstrate the effectiveness of our method on both synthetic and real-world datasets, including a recent single-cell genomics data set with around a hundred chemical perturbations across time points. Our results show that MMFM significantly outperforms existing methods at imputing data at missing time points. Section 5 (Experiments): We assessed the performance of MMFM using synthetic data as well as a single-cell RNA-seq dataset where each of multiple conditions (perturbations) was measured along multiple time points. We compared the performance of MMFM to that of several other methods
Researcher Affiliation	Collaboration	(1) Genentech, USA; (2) Heidelberg University, Germany; (3) German Cancer Research Center, Germany; (4) European Molecular Biology Lab, Germany; (5) Stanford University, USA; (6) Osaka University, Japan
Pseudocode	Yes	Algorithm 1 Pseudocode: Sampling from COT-MMFM
Open Source Code	Yes	CODE AVAILABILITY STATEMENT The code to reproduce the figures and tables, as well as to run the model and generate the simulated data, can be found at github.com/Genentech/MMFM.
Open Datasets	Yes	To further study how well MMFM generalizes, especially under irregular sampling over time, we applied it to the Beijing multi-site air quality data set (Chen, 2017). This dataset comprises hourly air pollutant data from 12 air-quality monitoring sites across Beijing.
Dataset Splits	Yes	For evaluation purposes, we withheld ten non-overlapping random treatments for each of the three time points. For 9 out of 12 stations, we selected 50% of the measurements, i.e. 13 months. For the other three stations, we selected only 7, 6 and 7 months as training data to simulate missing sensor data. These three stations are represented by the conditions c = 4, c = 7 and c = 10. We evaluated our method on all months that were not part of the training data set.
Hardware Specification	No	The paper mentions running experiments "on a GPU" in the context of computational complexity (Table 9) but does not specify the model or type of GPU, or any other hardware components like CPU or memory.
Software Dependencies	No	The paper mentions using specific software components such as the "Adam optimizer (Kingma & Ba, 2015)", the "Python package POT", and the "scVI model (Lopez et al., 2018)" but does not provide specific version numbers for any of these.
Experiment Setup	Yes	Table 5: Hyperparameters for model training. (*) Applicable to all variations of Flow Matching models discussed in the paper. Columns: Model | Hyperparameters | Values/Range. Models: FSI, MMFM. learning rate: [1e-2, 1e-3, 1e-4]; p_u: [0.0, 0.1, 0.2, 0.3]; latent dimensions (x, t, c): [16, 32, 64, 128, 256]; flow variance: [0.01, 0.1, 1, adaptive]; guidance w: k/10 for k ∈ {1, ..., 10} ∪ {20, 30}
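
To make the Experiment Setup row more concrete, the following is a minimal Python sketch of how the reported search space could be enumerated. It is not taken from the MMFM repository. The dictionary keys, the `iter_configs` helper, and the use of an exhaustive grid are all assumptions made for illustration, since the summary does not say how the values were searched. It also does not say which hyperparameters apply to which model. The guidance values assume the extracted expression reads w = k/10 for k in {1, ..., 10} ∪ {20, 30}.

```python
from itertools import product

# Value ranges as quoted in the Experiment Setup row (Table 5 of the paper).
# Key names are hypothetical; they are not identifiers from the MMFM code.
SEARCH_SPACE = {
    "learning_rate": [1e-2, 1e-3, 1e-4],
    "p_u": [0.0, 0.1, 0.2, 0.3],
    "latent_dim": [16, 32, 64, 128, 256],
    "flow_variance": [0.01, 0.1, 1, "adaptive"],
}

# Guidance strengths, ASSUMING the garbled source means
# w = k/10 for k in {1, ..., 10} union {20, 30}.
GUIDANCE_W = [k / 10 for k in list(range(1, 11)) + [20, 30]]


def iter_configs(space):
    """Yield one dict per combination of the values in `space`."""
    keys = list(space)
    for values in product(*(space[key] for key in keys)):
        yield dict(zip(keys, values))


# Exhaustive enumeration is an assumption; the source does not state
# whether a full grid or a subset of combinations was evaluated.
configs = list(iter_configs(SEARCH_SPACE))

if __name__ == "__main__":
    print(f"{len(configs)} training configurations")
    print(f"{len(GUIDANCE_W)} guidance values: {GUIDANCE_W}")
```

Under these assumptions the training grid has 3 × 4 × 5 × 4 = 240 combinations. The guidance strength is kept in a separate list because the summary reports it as its own range, which gives 12 candidate values.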