Learning stochastic dynamics from snapshots through regularized unbalanced optimal transport
Authors: Zhenyi Zhang, Tiejun Li, Peijie Zhou
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The effectiveness of our method is demonstrated with a synthetic gene regulatory network, high-dimensional Gaussian Mixture Model, and single-cell RNA-seq data from blood development. Compared with other methods, our approach accurately identifies growth and transition patterns, eliminates false transitions, and constructs the Waddington developmental landscape. |
| Researcher Affiliation | Academia | 1. LMAM and School of Mathematical Sciences, Peking University. 2. Center for Machine Learning Research, Peking University. 3. NELBDA, Peking University. 4. Center for Quantitative Biology, Peking University. 5. AI for Science Institute, Beijing. |
| Pseudocode | Yes | Algorithm 1 Training Regularized Unbalanced Optimal Transport |
| Open Source Code | Yes | Our code is available at: https://github.com/zhenyiizhang/DeepRUOT. |
| Open Datasets | Yes | The effectiveness of our method is demonstrated with a synthetic gene regulatory network, high-dimensional Gaussian Mixture Model, and single-cell RNA-seq data from blood development. Next, we evaluate our algorithm on a real scRNA-seq dataset. We use the same dataset as in (Sha et al., 2024; Weinreb et al., 2020), which involves mouse hematopoiesis analyzed by using a lineage tracing technique. |
| Dataset Splits | No | The paper describes how initial cells are chosen for the synthetic gene regulatory network and mentions using scRNA-seq data from various time points, but it does not specify explicit training/test/validation splits for model training or evaluation. The evaluation is done by learning dynamics from all time points and then comparing generated data to real data. |
| Hardware Specification | Yes | We use one A100 Nvidia GPU along with 16 CPU cores for computation at a shared high-performance computing cluster. |
| Software Dependencies | No | The paper mentions using the "Python Optimal Transport library (POT) (Flamary et al., 2021)" but does not provide a specific version number for POT or any other core software components like Python or PyTorch. |
| Experiment Setup | Yes | C.2 HYPERPARAMETERS SELECTION AND LOSS WEIGHTING Table 5: Parameter Settings for Different Datasets Across Two Training Stages (Regulatory Network (σ = 0.25), Mouse Hematopoiesis (σ = 0.25), EMT (σ = 0.05), Gaussian Mixtures (σ = 0.1)). The arrow (→) indicates the scheduling of parameter values within the Pre-Training Stage, where the numbers on the left indicate the hyperparameter values before resetting and those on the right indicate the values after resetting. For example, the (1.0, 0.1, 20) → (0.0, 0.1, 10) entry in the (λm, λd, Epochs) row denotes 20 epochs of training with λm = 1.0 and λd = 0.1, followed by another 10 epochs after λm was reset to 0.0. |