MORE: Molecule Pretraining with Multi-Level Pretext Task

Authors: Yeongyeong Son, Dasom Noh, Gyoungyoung Heo, Gyoung Jin Park, Sunyoung Kwon

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experimental results demonstrate that our proposed method effectively learns comprehensive representations by showing outstanding performance in both linear probing and full fine-tuning. We evaluate downstream task performance under two settings: 1) Linear probing... 2) Full fine-tuning... Table 3 shows the prediction performance of MORE and the other eight models under both linear probing and full fine-tuning."
Researcher Affiliation | Academia | "1Department of Information Convergence Engineering, Pusan National University, Korea; 2School of Biomedical Convergence Engineering, Pusan National University, Korea; 3Center for Artificial Intelligence Research, Pusan National University, Korea"
Pseudocode | No | The paper describes the methodology using mathematical equations (Equations 1-10) and descriptive text, along with architectural diagrams (Figure 2), but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | Code: https://github.com/IT-fatica/MORE
Open Datasets | Yes | "We use 2 million unlabeled molecules from ZINC15 (Sterling and Irwin 2015)... We use seven graph classification datasets from MoleculeNet (Wu et al. 2018), detailed in Table 1."
Dataset Splits | Yes | "The data are randomly split into training and validation sets in a 9:1 ratio... The downstream datasets are split into train/validation/test sets in an 8 : 1 : 1 ratio based on scaffolds (molecular substructures)."
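The scaffold-based 8:1:1 split quoted above can be sketched as follows. This is a minimal sketch, not the authors' code: `scaffold_split` and its group-assignment policy (largest scaffold groups first, as in common MoleculeNet protocols) are assumptions, and the scaffold strings are taken as precomputed inputs (in practice they would come from RDKit's Bemis-Murcko scaffold utilities).

```python
from collections import defaultdict

def scaffold_split(n_molecules, scaffolds, frac_train=0.8, frac_valid=0.1):
    """Assign whole scaffold groups to train/valid/test until each
    split reaches its target size, so no scaffold spans two splits.
    `scaffolds[i]` is the scaffold string of molecule i."""
    groups = defaultdict(list)
    for idx, scaf in enumerate(scaffolds):
        groups[scaf].append(idx)
    # Place the largest scaffold groups first (a common heuristic).
    ordered = sorted(groups.values(), key=len, reverse=True)

    n_train = int(frac_train * n_molecules)
    n_valid = int(frac_valid * n_molecules)
    train, valid, test = [], [], []
    for group in ordered:
        if len(train) + len(group) <= n_train:
            train.extend(group)
        elif len(valid) + len(group) <= n_valid:
            valid.extend(group)
        else:
            test.extend(group)
    return train, valid, test
```

Because groups are kept whole, the realized split sizes only approximate 8:1:1; test-set scaffolds never appear in training, which is the point of the protocol.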
Hardware Specification | No | The paper does not provide specific details about the hardware used for experiments, such as GPU models, CPU types, or memory specifications. It only describes the model architecture and training settings.
Software Dependencies | No | The paper mentions the use of the RDKit library (Landrum et al. 2020) for tasks like generating MACCS keys and conformers, but it does not specify the version number of RDKit or any other software dependencies.
Experiment Setup | Yes | "The encoder uses a 5-layer Graph Isomorphism Network (GIN) (Xu et al. 2018) with 300 hidden units. We set the masking ratio to 25%. The node-level decoder comprises a 1-layer Multi-Layer Perceptron (MLP) and a 1-layer GIN. The subgraph- and graph-level decoders are 2-layer MLPs with 256 hidden units and output shapes. The 3D-level decoder is a 3-layer MLP with hidden units of 256, 128, and 30. The downstream task decoder is a 1-layer MLP. Specifically, for the loss function, we set λ1 = 4.5, λ2 = 5.0, λ3 = 1.0, and λ4 = 0.04."
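The reported coefficients imply a weighted sum over the four pretext-task levels. A minimal sketch follows; only the weights λ1 = 4.5, λ2 = 5.0, λ3 = 1.0, λ4 = 0.04 come from the paper, while the mapping of λ1..λ4 to node/subgraph/graph/3D losses is an assumption based on the order the decoders are described, and the per-level loss values are placeholders.

```python
# Weights reported in the paper; the level-to-weight mapping is assumed.
LAMBDA_NODE, LAMBDA_SUBGRAPH, LAMBDA_GRAPH, LAMBDA_3D = 4.5, 5.0, 1.0, 0.04

def total_pretraining_loss(l_node, l_subgraph, l_graph, l_3d):
    """Combine the four level-wise pretext losses into one
    scalar objective as a weighted sum."""
    return (LAMBDA_NODE * l_node
            + LAMBDA_SUBGRAPH * l_subgraph
            + LAMBDA_GRAPH * l_graph
            + LAMBDA_3D * l_3d)
```

With all four per-level losses equal to 1.0 the combined objective is 4.5 + 5.0 + 1.0 + 0.04 = 10.54, showing that the 3D term is weighted far below the node- and subgraph-level terms.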