Decomposed Direct Preference Optimization for Structure-Based Drug Design

Authors: Xiwei Cheng, Xiangxin Zhou, Yuwei Yang, Yu Bao, Quanquan Gu

TMLR 2025

Reproducibility Variable Result LLM Response
Research Type | Experimental | Extensive experiments on the CrossDocked2020 benchmark show that DecompDPO significantly improves model performance, achieving up to 98.5% Med. High Affinity and a 43.9% success rate for molecule generation, and 100% Med. High Affinity and a 52.1% success rate for targeted molecule optimization. Code is available at https://github.com/laviaf/DecompDPO.
Researcher Affiliation | Collaboration | Xiwei Cheng (Khoury College of Computer Sciences, Northeastern University); Xiangxin Zhou (ByteDance Seed; School of Artificial Intelligence, University of Chinese Academy of Sciences; New Laboratory of Pattern Recognition (NLPR), State Key Laboratory of Multimodal Artificial Intelligence Systems (MAIS), Institute of Automation, Chinese Academy of Sciences (CASIA)); Yu Bao (ByteDance Seed); Yuwei Yang (ByteDance); Quanquan Gu (ByteDance Seed)
Pseudocode | No | The paper describes the methodology using equations and textual descriptions, but it does not include a clearly labeled pseudocode or algorithm block.
Open Source Code | Yes | Code is available at https://github.com/laviaf/DecompDPO.
Open Datasets | Yes | We followed prior work (Luo et al., 2021; Peng et al., 2022; Guan et al., 2023a;b), using the CrossDocked2020 dataset (Francoeur et al., 2020) to pre-train our reference model and evaluate the performance of DecompDPO.
Dataset Splits | Yes | According to the protocol established by Luo et al. (2021), we filtered complexes to retain only those with high-quality docking poses (RMSD < 1 Å) and diverse protein sequences (sequence identity < 30%), resulting in a refined dataset comprising 100,000 high-quality training complexes and 100 novel proteins for evaluation.
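As a concrete illustration, the two filtering criteria quoted above could be applied as follows. This is a minimal sketch: the field names, the `identity_fn` callback, and the greedy de-duplication strategy are illustrative placeholders, not the actual protocol of Luo et al. (2021).

```python
def filter_complexes(complexes, max_rmsd=1.0, max_identity=0.30, identity_fn=None):
    """Keep complexes with docking RMSD < 1 Angstrom and pairwise
    protein sequence identity < 30% (both thresholds from the report)."""
    kept, seen_seqs = [], []
    for c in complexes:
        if c["rmsd"] >= max_rmsd:
            continue  # discard low-quality docking poses
        if identity_fn and any(
            identity_fn(c["seq"], s) >= max_identity for s in seen_seqs
        ):
            continue  # discard near-duplicate protein sequences
        seen_seqs.append(c["seq"])
        kept.append(c)
    return kept
```

A toy run with a naive per-position identity measure: three complexes where one fails the RMSD cut and one duplicates an already-kept sequence leave a single surviving complex.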
Hardware Specification | Yes | The model is pre-trained on a single NVIDIA A6000 GPU, and it converges within 21 hours and 170k steps. For fine-tuning the model for molecule generation, we set βT = 0.001 and trained for 30,000 steps on one NVIDIA A40 GPU. For molecular optimization, we set βT = 0.02 and trained for 20,000 steps on one NVIDIA V100 GPU.
Software Dependencies | No | The paper mentions using Adam (Kingma & Ba, 2014) as an optimizer and the RDKit and AlphaSpace2 (Katigbak et al., 2020) toolkits for molecular fragmentation, but it does not provide specific version numbers for any software libraries or tools used in the experimental setup.
Experiment Setup | Yes | Pre-training: We use Adam (Kingma & Ba, 2014) for pre-training, with init_learning_rate=0.0004 and betas=(0.95, 0.999). The learning rate decays exponentially by a factor of 0.6, with minimize_learning_rate=1e-6; it is decayed whenever the validation loss shows no improvement for 10 consecutive evaluations. We set batch_size=8 and clip_gradient_norm=8. During training, Gaussian noise with a standard deviation of 0.1 is added to protein atom positions as data augmentation. To balance the magnitudes of the different losses, the reconstruction losses for atom and bond types are weighted by γv = 100 and γb = 100, respectively. We perform evaluations every 2000 training steps. ... Fine-tuning and Optimizing: For both fine-tuning and optimizing the model with DecompDPO, we use the Adam optimizer with init_learning_rate=1e-6 and betas=(0.95, 0.999), held constant throughout both processes. We set batch_size=4 and clip_gradient_norm=8. ... For fine-tuning the model for molecule generation, we set βT = 0.001 and trained for 30,000 steps on one NVIDIA A40 GPU. For molecular optimization, we set βT = 0.02 and trained for 20,000 steps on one NVIDIA V100 GPU. ... The λ used to penalize rewards with the energy terms proposed in Section 3.3 is set to 0.1.
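The plateau-based learning-rate decay quoted in the pre-training setup (factor 0.6, floor 1e-6, patience of 10 evaluations spaced 2000 steps apart) can be sketched in plain Python. The class and method names below are illustrative, not taken from the DecompDPO codebase, which may implement this via a framework scheduler instead.

```python
class PlateauDecay:
    """Decay the learning rate by `factor` whenever validation loss fails
    to improve for `patience` consecutive evaluations, floored at `min_lr`.
    Hyperparameter defaults follow the pre-training setup in the report."""

    def __init__(self, init_lr=4e-4, factor=0.6, min_lr=1e-6, patience=10):
        self.lr = init_lr
        self.factor = factor
        self.min_lr = min_lr
        self.patience = patience
        self.best = float("inf")
        self.bad_evals = 0

    def step(self, val_loss):
        """Call once per evaluation (i.e., every 2000 training steps)."""
        if val_loss < self.best:
            self.best = val_loss
            self.bad_evals = 0
        else:
            self.bad_evals += 1
            if self.bad_evals >= self.patience:
                self.lr = max(self.lr * self.factor, self.min_lr)
                self.bad_evals = 0  # restart the patience window
        return self.lr
```

In a PyTorch training loop the same behavior would typically come from `torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, factor=0.6, patience=10, min_lr=1e-6)`, paired with `torch.nn.utils.clip_grad_norm_(model.parameters(), 8)` for the gradient clipping mentioned above.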