MF-LAL: Drug Compound Generation Using Multi-Fidelity Latent Space Active Learning

Authors: Peter Eckmann, Dongxia Wu, Germano Heinzelmann, Michael K Gilson, Rose Yu

ICML 2025 | Venue PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental Our experiments on two disease-relevant proteins show that MF-LAL produces compounds with significantly better binding free energy scores than other single- and multi-fidelity approaches (~50% improvement in mean binding free energy score). The paper also includes a dedicated section titled "4. Experiments" with subsections for setup, baselines, and results.
Researcher Affiliation Academia 1Department of Computer Science and Engineering, UC San Diego, La Jolla, California, United States 2Departamento de Física, Universidade Federal de Santa Catarina, Brazil 3Department of Chemistry and Biochemistry, UC San Diego, La Jolla, California, United States 4Skaggs School of Pharmacy and Pharmaceutical Sciences, UC San Diego, La Jolla, California, United States. Correspondence to: Peter Eckmann <EMAIL>, Michael K. Gilson <EMAIL>, Rose Yu <EMAIL>.
Pseudocode Yes The paper includes clearly labeled algorithm blocks: "Algorithm 1 Active learning for MF-LAL" and "Algorithm 2 MF-LAL molecule generation procedure".
Open Source Code Yes The code is available at https://github.com/Rose-STL-Lab/MF-LAL.
Open Datasets Yes We used BRD4(2) and, separately, c-MET data from BindingDB (Liu et al., 2007) to train a simple linear regression model.
Dataset Splits No Each model was provided with an initial dataset of random ZINC250k (Irwin et al., 2012) compounds evaluated at each fidelity. We selected 5 random compounds for each fidelity, except the first, for which we supplied 200,000 compounds and their associated oracle outputs. This describes initial dataset provisioning for active learning, but not explicit train/test/validation splits for model evaluation.
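The provisioning scheme above can be sketched as follows. This is an illustrative sketch, not the authors' code; the number of fidelity levels (here 4) and the toy compound pool are assumptions, and a real run would draw SMILES strings from ZINC250k.

```python
import random

NUM_FIDELITIES = 4          # assumed number of fidelity levels (not stated in this excerpt)
FIRST_FIDELITY_SIZE = 200_000
HIGHER_FIDELITY_SIZE = 5

def initial_datasets(compound_pool, seed=0):
    """Sample the initial per-fidelity datasets: 200,000 compounds for the
    first (cheapest) fidelity and 5 random compounds for each higher one."""
    rng = random.Random(seed)
    datasets = []
    for level in range(NUM_FIDELITIES):
        n = FIRST_FIDELITY_SIZE if level == 0 else HIGHER_FIDELITY_SIZE
        datasets.append(rng.sample(compound_pool, n))
    return datasets

# Toy stand-in for the ZINC250k compound library:
pool = [f"compound_{i}" for i in range(250_000)]
sets = initial_datasets(pool)
print([len(s) for s in sets])  # [200000, 5, 5, 5]
```

Each sampled compound would then be evaluated by its fidelity's oracle (e.g. docking at the lowest level) before active learning begins.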
Hardware Specification Yes All experiments were conducted on a server with 8 RTX 2080 Ti GPUs.
Software Dependencies No We used AutoDock-GPU (Santos-Martins et al., 2021), a GPU-accelerated version of AutoDock4, for all docking computation. All molecular dynamics simulations are run with AMBER with GPU support. While specific software and tools are mentioned, precise version numbers for these components are not provided.
Experiment Setup Yes The encoder, decoder, and h networks are all 3-layer feed-forward networks with ReLU activations and a 512-dimensional hidden layer. Each latent space has 64 dimensions. At each active learning step, we train the whole model from scratch until convergence with the Adam optimizer using a learning rate of 0.0001. For the molecule generation procedure using gradient-based optimization, we use the Adam optimizer with a learning rate of 0.1 for 100 epochs. We set β = 1 during active learning, and β = 0 during inference to focus only on the most promising compounds.
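A minimal NumPy sketch of the network shape described in the setup: a 3-layer feed-forward network with ReLU activations, a 512-dimensional hidden layer, and a 64-dimensional latent output. The input dimension (256) and initialization scale are illustrative assumptions; the actual model would be built and trained in a deep learning framework with Adam, as the setup states.

```python
import numpy as np

HIDDEN_DIM = 512   # hidden layer width from the paper's setup
LATENT_DIM = 64    # latent space dimensionality from the paper's setup

def init_mlp(rng, in_dim, out_dim, hidden_dim=HIDDEN_DIM):
    """Initialize weights and biases for a 3-layer feed-forward network."""
    dims = [in_dim, hidden_dim, hidden_dim, out_dim]
    return [(rng.standard_normal((dims[i], dims[i + 1])) * 0.01,
             np.zeros(dims[i + 1]))
            for i in range(3)]

def forward(params, x):
    """Forward pass: ReLU on the two hidden layers, linear output layer."""
    for i, (W, b) in enumerate(params):
        x = x @ W + b
        if i < len(params) - 1:
            x = np.maximum(x, 0.0)  # ReLU activation
    return x

rng = np.random.default_rng(0)
encoder = init_mlp(rng, in_dim=256, out_dim=LATENT_DIM)  # 256 is an assumed input size
x = rng.standard_normal((4, 256))                        # batch of 4 example inputs
z = forward(encoder, x)
print(z.shape)  # (4, 64): each input mapped into the 64-dimensional latent space
```

The decoder and the fidelity-specific h networks would follow the same 3-layer pattern with their own input/output dimensions.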