NExT-Mol: 3D Diffusion Meets 1D Language Modeling for 3D Molecule Generation
Authors: Zhiyuan Liu, Yanchen Luo, Han Huang, Enzhi Zhang, Sihang Li, Junfeng Fang, Yaorui Shi, Xiang Wang, Kenji Kawaguchi, Tat-Seng Chua
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we evaluate NExT-Mol's performance on de novo 3D molecule generation and conditional 3D molecule generation. Further, we report results of 3D conformer prediction, the critical second step in our two-step generation process. Finally, we present ablation studies to demonstrate the effectiveness of each component of NExT-Mol. |
| Researcher Affiliation | Academia | 1 National University of Singapore, 2 University of Science and Technology of China, 3 Chinese University of Hong Kong, 4 Hokkaido University |
| Pseudocode | Yes | Algorithm 1 Training Algorithm 2 Sampling 3D Conformers |
| Open Source Code | Yes | Our codes and pretrained checkpoints are available at https://github.com/acharkq/NExT-Mol. |
| Open Datasets | Yes | Datasets. As Table 1 shows, we evaluate on the popular GEOM-DRUGS (Axelrod & Gómez-Bombarelli, 2022), GEOM-QM9 (Axelrod & Gómez-Bombarelli, 2022), and QM9-2014 (Ramakrishnan et al., 2014) datasets. Among them, we focus on GEOM-DRUGS, which is the most pharmaceutically relevant and largest one. Due to different tasks incorporating different dataset splits, we separately fine-tune NExT-Mol for each task without sharing weights. |
| Dataset Splits | Yes | Evaluation. Following (Wang et al., 2024; Jing et al., 2022), we use the dataset split of 243473/30433/1000 for GEOM-DRUGS and 106586/13323/1000 for GEOM-QM9, provided by (Ganea et al., 2021). |
| Hardware Specification | Yes | The training was done on 4 NVIDIA A100-40G GPUs and took approximately two weeks. |
| Software Dependencies | No | The paper mentions software components like Flash-Attention, FSDP, SELFIES, and RDKit, but does not provide specific version numbers for any of them in the text. |
| Experiment Setup | Yes | Table 17: Hyperparameters for pretraining MoLlama. Table 18: Hyperparameters of the DMT-B and DMT-L models. DMT Settings. We use a dropout rate of 0.1 for QM9-2014 and 0.05 for GEOM-DRUGS. Following (Huang et al., 2024), we select only the conformer with the lowest energy for training on the GEOM-DRUGS dataset. For both datasets, we train DMT-B for 1000 epochs. The batch size for QM9-2014 is 2048 and the batch size for GEOM-DRUGS is 256. |
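The reported experimental settings (dataset splits and DMT-B training hyperparameters) can be collected into a small config sketch for a reproduction attempt. This is a minimal illustration assembled from the quotes above; the key names and the `get_dmt_config` helper are illustrative assumptions, not taken from the NExT-Mol codebase.

```python
# Hedged sketch: reported NExT-Mol settings gathered into plain dicts.
# Key names are illustrative and do not come from the authors' code.

# Dataset splits (train/valid/test) as reported, provided by Ganea et al. (2021).
DATASET_SPLITS = {
    "GEOM-DRUGS": (243473, 30433, 1000),
    "GEOM-QM9": (106586, 13323, 1000),
}

# DMT-B training hyperparameters as quoted in the Experiment Setup row.
DMT_B_CONFIG = {
    "QM9-2014": {
        "dropout": 0.1,
        "epochs": 1000,
        "batch_size": 2048,
    },
    "GEOM-DRUGS": {
        "dropout": 0.05,
        "epochs": 1000,
        "batch_size": 256,
        # Per the paper (following Huang et al., 2024), only the
        # lowest-energy conformer per molecule is used for training.
        "conformers_per_molecule": 1,
    },
}


def get_dmt_config(dataset: str) -> dict:
    """Return the reported DMT-B training settings for a dataset.

    Raises KeyError for datasets not covered by the quoted settings.
    """
    return DMT_B_CONFIG[dataset]
```

Such a table-to-config transcription makes it easy to spot gaps the report flags elsewhere, e.g. the missing software version numbers noted under Software Dependencies.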