CheapNet: Cross-attention on Hierarchical representations for Efficient protein-ligand binding Affinity Prediction
Authors: Hyukjun Lim, Sun Kim, Sangseon Lee
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we evaluate our CheapNet on various protein-ligand affinity tasks, including ligand binding affinity (LBA) prediction and ligand efficacy prediction (LEP). Detailed hyperparameter settings and the experimental setup are provided in Appendix A1. We provide comprehensive comparisons with state-of-the-art methods, as well as ablation studies to assess the contributions of individual components. The code is available at https://github.com/hyukjunlim/CheapNet. |
| Researcher Affiliation | Collaboration | 1Department of Materials Science and Engineering, Seoul National University 2Department of Computer Science and Engineering, Seoul National University 3Interdisciplinary Program in Bioinformatics, Seoul National University 4Interdisciplinary Program in Artificial Intelligence, Seoul National University 5AIGENDRUG Co., Ltd., Seoul, Republic of Korea 6Department of Artificial Intelligence, Inha University EMAIL, EMAIL |
| Pseudocode | Yes | A.2 PSEUDOCODE OF CHEAPNET. In this section, we provide the detailed pseudocode for CheapNet, which outlines the step-by-step process for predicting protein-ligand binding affinity (Algorithm 1). |
| Open Source Code | Yes | The code of CheapNet is available at https://github.com/hyukjunlim/CheapNet. |
| Open Datasets | Yes | We evaluate CheapNet on the widely-used PDBbind dataset (Liu et al., 2017b), which contains 3D structures of protein-ligand complexes with experimentally measured binding affinities. ...We evaluate CheapNet on the CSAR NRC-HiQ dataset (Dunbar Jr et al., 2013), an external benchmark for protein-ligand binding affinity prediction. ...We assess the ranking power and scoring power of CheapNet using the CASF-2016 dataset (Su et al., 2018). ...we curated a dataset from the well-established DUD-E dataset (Mysinger et al., 2012)... For evaluation, we use the Protein-Protein Affinity Benchmark Version 2 (Vreven et al., 2015) |
| Dataset Splits | Yes | For cross-dataset evaluation, the PDBbind v2016 general set is used as the training and validation dataset, while the v2013 core set, v2016 core set, and v2019 holdout set serve as test datasets. ...For diverse protein evaluation, the PDBbind v2019 refined set is used, with the LBA 30% and LBA 60% datasets split based on protein sequence identity thresholds of 30% and 60%, respectively. ...Ligand Binding Affinity / Diverse Protein Evaluation ...Sequence identity 30%: Training Set: 3,507 samples Validation Set: 466 samples Test Set: 490 samples. Sequence identity 60%: Training Set: 3,563 samples Validation Set: 448 samples Test Set: 452 samples. |
| Hardware Specification | Yes | All experiments were conducted on two separate NVIDIA RTX 3090 GPUs (24GB each), with each model running on a single GPU. |
| Software Dependencies | No | The paper mentions 'RDKit' in the context of data processing ('After removing complexes that RDKit could not process') and for 3D structure processing ('processed its 3D structures using RDKit'). However, no specific version number for RDKit or any other software dependency is provided. |
| Experiment Setup | Yes | Detailed hyperparameter setting and experimental setup are provided in Appendix A1. ...Table A1: The search space of hyperparameters for cross-dataset, LBA 30%, LBA 60%, LEP task. The optimal hyperparameters are shown in bold. Hyperparameters include: Activation function, Batch size, Cutoff-intra, Cutoff-inter, Dropout rate, Epoch, Hidden dim, Learning rate, LR scheduler, Optimizer, Weight decay, Number of clusters, Number of layers (Message Passing, Differentiable pooling, Prediction MLP). |
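The Experiment Setup row above lists the hyperparameter *names* searched in the paper's Table A1 but not the candidate values (the optimal values are bolded in the paper itself). A minimal sketch of how such a grid could be represented and enumerated is shown below; every candidate value here is an illustrative placeholder, not the paper's actual search space.

```python
import itertools

# Hypothetical reconstruction of a Table A1-style search space.
# Keys follow the hyperparameter names quoted above; the value lists
# are placeholders chosen for illustration only.
search_space = {
    "activation": ["ReLU", "SiLU"],
    "batch_size": [16, 32],
    "cutoff_intra": [4.0, 5.0],   # Å, intra-molecular edge cutoff (assumed units)
    "cutoff_inter": [5.0, 6.0],   # Å, protein-ligand edge cutoff (assumed units)
    "dropout": [0.1, 0.2],
    "epochs": [100],
    "hidden_dim": [64, 128],
    "learning_rate": [1e-4, 1e-3],
    "num_clusters": [4, 8],
    "num_mp_layers": [2, 3],      # message-passing layers
}

# Enumerate the full Cartesian grid of configurations.
keys = list(search_space)
configs = [dict(zip(keys, vals))
           for vals in itertools.product(*search_space.values())]
print(len(configs))  # 2^9 * 1 = 512 configurations for this placeholder grid
```

In practice a grid this size is usually pruned (random search or early stopping per configuration), which matters here since each run occupies a full RTX 3090 per the Hardware Specification row.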