Beyond Atoms: Enhancing Molecular Pretrained Representations with 3D Space Modeling
Authors: Shuqi Lu, Xiaohong Ji, Bohang Zhang, Lin Yao, Siyuan Liu, Zhifeng Gao, Linfeng Zhang, Guolin Ke
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments show that SpaceFormer significantly outperforms previous 3D MPR models across various downstream tasks with limited data, validating the benefit of leveraging the additional 3D space beyond atoms in MPR models. [...] We conduct extensive experiments to evaluate the effectiveness and efficiency of SpaceFormer. Across a total of 15 diverse downstream tasks, SpaceFormer achieves the best performance on 10 tasks and ranks within the top 2 on 14 tasks. Ablation studies further confirm that each component plays a critical role in enhancing either the performance or efficiency of SpaceFormer. |
| Researcher Affiliation | Collaboration | 1DP Technology, Beijing, China 2Peking University, Beijing, China. Correspondence to: Guolin Ke <EMAIL>. |
| Pseudocode | No | The paper describes the model architecture and components in detail, including 'grid-based space discretization', 'grid sampling/merging', and 'efficient 3D positional encoding', but it does not present these as structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain any explicit statement about releasing source code or a link to a code repository. |
| Open Datasets | Yes | Pretraining settings. We use the same pretraining dataset as Zhou et al. (2023), which contains a total of 19 million molecules. [...] For computational properties, we sample a 20K subset from the huge dataset of GDB-17 (Ramakrishnan et al., 2014; Ruddigkeit et al., 2012) and select the electronic properties HOMO, LUMO and GAP. Additionally, we use another 21K subset from the same dataset following (Ramakrishnan et al., 2015), selecting the energy properties E1-CC2, E2-CC2, f1-CC2 and f2-CC2. Furthermore, we incorporate the dataset (Wahab et al., 2022) of cata-condensed polybenzenoid hydrocarbons [...] For experimental properties, we select the BBBP and BACE datasets from MoleculeNet, ensuring that all duplicate and structurally invalid molecules were excluded. Additionally, we employ the HLM, MDR1-MDCK ER (MME), and Solubility (Solu) datasets from the Biogen ADME dataset (Fang et al., 2023). |
| Dataset Splits | Yes | In all tasks, datasets were split into training, validation, and test sets in an 8:1:1 ratio. We applied the Out-of-Distribution splitting methods, where the sets are divided based on scaffold similarity. |
| Hardware Specification | Yes | This configuration results in a model with approximately 67.8M (encoder) parameters and requires about 50 hours of training using 8 NVIDIA A100 GPUs. |
| Software Dependencies | No | The paper mentions techniques like Flash Attention (Dao et al., 2022) and RoPE (Su et al., 2024), but does not provide specific version numbers for any software libraries or dependencies used in their implementation. |
| Experiment Setup | Yes | The pretraining settings are detailed in Table 5, the downstream finetuning settings in Table 6, and the downstream tasks in Table 7. [...] Table 5. Pretraining Settings Hyper-parameters Value Peak learning rate 1e-4 [...] Batch size 128 [...] Mask ratio 0.3 Cell edge length cl 0.49 Å [...] Table 6. Fine-tuning Settings Hyper-parameters Value Peak learning rate [5e-5, 1e-4] Batch size [32, 64] Epochs 200 Pooler dropout [0.0, 0.1] |
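The "grid-based space discretization" named in the Pseudocode row, combined with the cell edge length cl = 0.49 Å quoted from Table 5, can be sketched as follows. This is an illustrative reconstruction, not the paper's implementation; the function names `discretize` and `occupied_cells` are invented here.

```python
import math

def discretize(coords, cell_edge=0.49):
    """Map 3D atom coordinates (in Angstrom) to integer grid-cell indices.

    Sketch of grid-based space discretization: each atom is assigned to
    the cubic cell containing it, with cells of edge length `cell_edge`
    (0.49 Angstrom per the paper's Table 5).
    """
    return [tuple(math.floor(x / cell_edge) for x in xyz) for xyz in coords]

def occupied_cells(coords, cell_edge=0.49):
    """Return the set of cells containing at least one atom.

    Cells left empty between atoms are the '3D space beyond atoms' that
    the paper proposes to model alongside atomic tokens.
    """
    return set(discretize(coords, cell_edge))

# Toy example: a C-H bond length of ~1.09 Angstrom spans three cells
# along one axis at cl = 0.49 Angstrom.
atoms = [(0.0, 0.0, 0.0), (1.09, 0.0, 0.0), (0.3, 0.1, 0.0)]
cells = occupied_cells(atoms)
```

The first and third atoms fall into the same cell here, which is the situation the paper's "grid sampling/merging" component would have to handle.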
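The Dataset Splits row quotes an 8:1:1 out-of-distribution split based on scaffold similarity. A minimal sketch of such a split is shown below, assuming scaffold keys are precomputed per molecule (e.g. Bemis-Murcko scaffolds via RDKit, not shown); the function name and greedy fill strategy are illustrative assumptions, not the paper's exact procedure.

```python
from collections import defaultdict

def scaffold_split(scaffolds, frac_train=0.8, frac_valid=0.1):
    """Greedy 8:1:1 out-of-distribution split by scaffold group.

    `scaffolds` is one scaffold key per molecule. All molecules sharing
    a scaffold are kept in the same split, so test-set scaffolds never
    appear in training (the OOD property quoted in the report).
    """
    groups = defaultdict(list)
    for idx, scaf in enumerate(scaffolds):
        groups[scaf].append(idx)
    # Place the largest scaffold groups first so the common scaffolds
    # land in the (largest) training split.
    ordered = sorted(groups.values(), key=len, reverse=True)
    n = len(scaffolds)
    train, valid, test = [], [], []
    for group in ordered:
        if len(train) + len(group) <= frac_train * n:
            train += group
        elif len(valid) + len(group) <= frac_valid * n:
            valid += group
        else:
            test += group
    return train, valid, test

# Toy example: one common scaffold "a" and two rare ones.
splits = scaffold_split(["a"] * 8 + ["b"] + ["c"])
```

Greedy filling by group size is one common convention (e.g. in MoleculeNet-style splitters); the paper does not specify which variant it used.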