Curriculum-aware Training for Discriminating Molecular Property Prediction Models

Authors: Hansi Yang, Quanming Yao, James Kwok

ICLR 2025

Reproducibility Variable Result LLM Response
Research Type Experimental Extensive evaluation on various molecular property prediction datasets validates the effectiveness of our approach. ... Empirical results on various molecular property data sets demonstrate the effectiveness of the proposed method. ... In this section, we demonstrate the performance of the proposed method on both classification data sets (Section 5.1) popularly used in existing works (Stärk et al., 2022; Wang et al., 2023b; Zhou et al., 2023) and regression data sets (Section 5.2) that are more common in real-world applications (van Tilborg et al., 2022). Section 5.3 presents ablation studies to verify the effectiveness of each component in the proposed method. The effect of the hyper-parameters that define R(t) is studied in Section 5.4. Section 5.5 further visualizes the loss distribution on molecules. Section 5.6 presents case studies to better understand the proposed method.
Researcher Affiliation Academia Hansi Yang, Department of Computer Science and Engineering, Hong Kong University of Science and Technology, Hong Kong, China; Quanming Yao, Department of Electronic Engineering, State Key Laboratory of Space Network and Communications, Tsinghua University, Beijing, China; James Kwok, Department of Computer Science and Engineering, Hong Kong University of Science and Technology, Hong Kong, China
Pseudocode Yes Algorithm 1 Learning with Activity Cliff (LAC). 1: Initialize prediction model f with parameter w (random initialization or pre-trained weights); 2: for t = 0, ..., T − 1 do 3: Draw a mini-batch B from molecule data set D; 4: Obtain the set A of molecule pairs in B with activity cliff; 5: Determine R(t); 6: Select R(t)·|B| large-loss samples B̂ from B based on network f's predictions; 7: Select R(t)·|A| pairs of molecules Â with activity cliffs and compute Le in (4); 8: Update w = w − η∇w(L(w; B̂) + αLe(w; Â)); 9: end for
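The core of the algorithm's line 6 is selecting the R(t)·|B| largest-loss samples from each mini-batch. A minimal sketch of that selection step in plain Python follows; the function name and the example losses are illustrative, not from the authors' code.

```python
def select_large_loss(losses, fraction):
    """Return indices of the round(fraction * |B|) samples with the largest loss.

    `losses` is a list of per-sample losses computed from the network f's
    predictions; `fraction` is R(t) at the current step t.
    """
    k = max(1, round(fraction * len(losses)))
    # Sort batch indices by loss, largest first, and keep the top k.
    order = sorted(range(len(losses)), key=lambda i: losses[i], reverse=True)
    return order[:k]

# Example: with R(t) = 0.5, half the batch (the largest losses) is selected.
batch_losses = [0.1, 0.9, 0.4, 0.7]
print(select_large_loss(batch_losses, 0.5))  # -> [1, 3]
```

The same top-k pattern applies to line 7, ranking molecule pairs in A rather than individual samples.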
Open Source Code No The paper does not provide a specific repository link, an explicit statement of code release, or mention code in supplementary materials for their methodology. It mentions using existing models like GIN, GraphGPS, 3D-PGT, Uni-Mol, and MLP(ECFP), but not releasing their own implementation of LAC.
Open Datasets Yes We consider four tasks from the Tox21 data set (Wu et al., 2018) which predict a molecule's response to different receptors (NR-AhR, NR-ER, SR-ARE and SR-MMP). ... We perform experiments on eight classification tasks from MoleculeNet (Wu et al., 2018): Tox21, ToxCast, Sider, MUV, Bace, BBBP, ClinTox and HIV. ... we select five data sets from the ChEMBL database (Zdrazil et al., 2023), which describe the (continuous) bioactivity values of molecules to a specific target.
Dataset Splits Yes For the classification experiments, the data splits of all data sets in our experiments follow the scaffold split in (Wang et al., 2023b). For the regression experiments, the data splits of all data sets in our experiments are the same as in (van Tilborg et al., 2022), and we use a three-layer MLP model with input dimension 1024 and hidden dimension 512 for all hidden layers.
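The regression backbone described above (a three-layer MLP over 1024-dimensional ECFP fingerprints with 512-dimensional hidden layers) can be sketched as a NumPy forward pass. The layer breakdown (two hidden layers plus a scalar output) and the ReLU nonlinearity are assumptions about details the excerpt does not state.

```python
import numpy as np

rng = np.random.default_rng(0)
dims = [1024, 512, 512, 1]  # ECFP input -> hidden -> hidden -> scalar bioactivity
# He-style initialization; biases start at zero.
weights = [rng.standard_normal((a, b)) * np.sqrt(2.0 / a) for a, b in zip(dims, dims[1:])]
biases = [np.zeros(b) for b in dims[1:]]

def mlp_forward(x):
    """Forward pass: affine layers with ReLU on all but the output layer."""
    for i, (W, b) in enumerate(zip(weights, biases)):
        x = x @ W + b
        if i < len(weights) - 1:
            x = np.maximum(x, 0.0)
    return x

fp = rng.integers(0, 2, size=(8, 1024)).astype(float)  # a batch of 8 binary fingerprints
print(mlp_forward(fp).shape)  # (8, 1)
```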
Hardware Specification Yes All experiments are run on a single NVIDIA RTX A6000 GPU.
Software Dependencies No The paper mentions using the Adam optimizer (Kingma & Ba, 2015), but does not specify version numbers for any programming languages (e.g., Python), libraries (e.g., PyTorch, TensorFlow), or other software components used for implementation.
Experiment Setup Yes All experiments are run on a single NVIDIA RTX A6000 GPU. For all experiments in this work, we use the Adam optimizer (Kingma & Ba, 2015) and follow its default hyper-parameters: the learning rate η is set to 0.001, the first-order momentum weight β1 is set to 0.9, and the second-order momentum weight β2 is set to 0.99. The batch size is set to 256 for all data sets. Unless otherwise specified, we set the R(t) schedule as R(t) = λ min(t/(γT), 1) with λ = 0.2 and γ = 0.1, and the weight α for the pairwise loss Le is set to 0.1.
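With λ = 0.2 and γ = 0.1 as stated, the schedule R(t) = λ·min(t/(γT), 1) ramps linearly from 0 to 0.2 over the first 10% of training and then stays flat. A small sketch makes the shape concrete; the total step count T below is an arbitrary example, not a value from the paper.

```python
def R(t, T, lam=0.2, gamma=0.1):
    """Curriculum fraction R(t) = lambda * min(t / (gamma * T), 1)."""
    return lam * min(t / (gamma * T), 1.0)

T = 1000  # example total number of training steps (not specified in the excerpt)
# Linear warm-up until t = gamma*T = 100, constant at lambda = 0.2 afterwards.
print([round(R(t, T), 3) for t in (0, 50, 100, 500)])  # [0.0, 0.1, 0.2, 0.2]
```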