reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Offline Model-based Optimization for Real-World Molecular Discovery

Authors: Dong-Hee Shin, Young-Han Son, Hyun Jung Lee, Deok-Joong Lee, Tae-Eui Kam

ICML 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Experimental results on various offline multi-objective molecular optimization problems validate the effectiveness of MolStitch. The source code is available online.
Researcher Affiliation	Academia	Department of Artificial Intelligence, Korea University, Seoul, Republic of Korea. Correspondence to: Tae-Eui Kam <EMAIL>.
Pseudocode	Yes	More details of the generative model's loss function are in Appendix H, and the pseudocode for our MolStitch framework is in Appendix K.
Open Source Code	Yes	The source code is available online. Additionally, the source code for our proposed framework is available online at https://github.com/MolecularTeam/MolStitch.
Open Datasets	Yes	In the first stage of our framework, we perform unsupervised pre-training for StitchNet using the publicly available ZINC dataset (Sterling & Irwin, 2015). To construct the offline datasets for both experiments, we utilized the ZINC dataset (Sterling & Irwin, 2015), which is a publicly available chemical database that provides a collection of commercially available compounds.
Dataset Splits	Yes	For the MPO task, the total number of oracle calls was limited to 10,000 (Gao et al., 2022). Following this guideline, we allocated 5,000 calls to construct the offline dataset and reserved the remaining 5,000 for evaluation... For the docking score optimization task, the total number of oracle calls was restricted to 3,000 (Lee et al., 2023)... we allocated 1,500 oracle calls to construct the offline dataset and the remaining 1,500 to evaluate the performance...
Hardware Specification	No	The paper does not provide specific details about the hardware used for experiments. It mentions the number of 'oracle calls' for dataset collection and evaluation but no information on CPUs, GPUs, or other computational resources.
Software Dependencies	No	The paper mentions generative models like REINVENT, Mamba, and GFlowNets, and refers to hyperparameters for them in Table 17, but it does not specify any software libraries with version numbers (e.g., Python, PyTorch, RDKit, etc.).
Experiment Setup	Yes	The final hyperparameters for the generative models were primarily determined based on the performance of REINVENT, which served as our backbone generative model, and are detailed in Table 17. Table 17. The hyperparameter settings for generative models in MolStitch framework. Table 18. The hyperparameter settings for StitchNet.
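
The oracle-call budgets quoted in the Dataset Splits row can be checked with a short sketch. The names `OracleBudget` and `split_budget` are hypothetical and do not come from the paper or its repository. The even 50/50 split is inferred from the quoted numbers only (5,000 of 10,000 calls for the MPO task, 1,500 of 3,000 for the docking task); the authors' actual bookkeeping may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OracleBudget:
    """Hypothetical container for an oracle-call budget (not from the paper's code)."""

    total_calls: int
    offline_calls: int  # calls spent constructing the offline dataset

    @property
    def evaluation_calls(self) -> int:
        # Whatever is not spent on the offline dataset is reserved for evaluation.
        return self.total_calls - self.offline_calls


def split_budget(total_calls: int, offline_fraction: float = 0.5) -> OracleBudget:
    """Split a total oracle budget into offline-dataset and evaluation shares.

    The default fraction of 0.5 is an assumption that matches the quoted figures.
    """
    if total_calls < 0:
        raise ValueError("total_calls must be non-negative")
    if not 0.0 <= offline_fraction <= 1.0:
        raise ValueError("offline_fraction must lie in [0, 1]")
    return OracleBudget(total_calls, int(total_calls * offline_fraction))


# Totals quoted in the Dataset Splits row.
mpo = split_budget(10_000)      # MPO task (Gao et al., 2022)
docking = split_budget(3_000)   # docking score task (Lee et al., 2023)

print(mpo.offline_calls, mpo.evaluation_calls)
print(docking.offline_calls, docking.evaluation_calls)
```

Keeping the evaluation share as a derived property, rather than a stored field, means the two shares cannot drift out of sync with the total.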