UniGEM: A Unified Approach to Generation and Property Prediction for Molecules

Authors: Shikun Feng, Yuyan Ni, Lu Yan, Zhi-Ming Ma, Wei-Ying Ma, Yanyan Lan

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experimental results demonstrate that UniGEM achieves superior performance in both the predictive and generative tasks. Specifically, in molecule generation, UniGEM produces significantly more stable molecules, with molecular stability improving by approximately 10% over the diffusion-based molecular generation model EDM (Hoogeboom et al., 2022). In terms of property prediction, our method significantly outperforms the baseline trained from scratch, even achieving accuracy comparable to some pre-training methods without introducing any additional pre-training steps or large-scale unlabeled molecular data.
Researcher Affiliation | Academia | (1) Institute for AI Industry Research (AIR), Tsinghua University; (2) Academy of Mathematics and Systems Science, Chinese Academy of Sciences; (3) University of Chinese Academy of Sciences; (4) Department of Computer Science and Technology, Tsinghua University; (5) Beijing Academy of Artificial Intelligence. Correspondence to EMAIL.
Pseudocode | No | The paper describes the methodology using textual explanations and figures (e.g., Figure 1, Figure 3, Figure 4) but does not include any clearly labeled 'Pseudocode' or 'Algorithm' blocks with structured steps.
Open Source Code | Yes | The code is released publicly at https://github.com/fengshikun/UniGEM.
Open Datasets | Yes | In our experiments, we utilize two widely used datasets. The QM9 dataset (Ruddigkeit et al., 2012; Ramakrishnan et al., 2014) serves both molecular generation and property prediction tasks, comprising approximately 134,000 small organic molecules with up to nine heavy atoms, each annotated with quantum chemical properties such as LUMO (Lowest Unoccupied Molecular Orbital), HOMO (Highest Occupied Molecular Orbital), the HOMO-LUMO gap, and polarizability (α). These properties are used as ground-truth labels for property prediction. In contrast, the GEOM-Drugs dataset (Axelrod & Gomez-Bombarelli, 2022) focuses on drug-like molecules, providing a large-scale collection of molecular conformers.
Dataset Splits | Yes | The splitting strategies for both benchmarks follow previous practice. For the QM9 dataset, we adopt the same split as in prior methods (Hoogeboom et al., 2022; Satorras et al., 2021; Anderson et al., 2019), dividing the data into training (100K), validation (18K), and test (13K) sets. Similarly, for the GEOM-Drugs dataset, we follow the approach outlined in Hoogeboom et al. (2022) to randomly split the dataset into training, validation, and test sets in an 8:1:1 ratio.
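The fixed-size QM9 split quoted above (100K train / 18K validation / ~13K test) can be sketched as a shuffled index partition. This is a minimal illustration, not the authors' released code; the total count (130,831 molecules, the filtered QM9 size used by EDM-style pipelines) and the seed are assumptions for the example.

```python
import numpy as np

def qm9_split(n_total=130831, n_train=100000, n_val=18000, seed=0):
    """Shuffle molecule indices and carve out fixed-size splits.

    Mirrors the split sizes quoted above: 100K train, 18K validation,
    and the remaining ~13K molecules as the test set. The seed and
    n_total are illustrative assumptions, not the paper's values.
    """
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_total)
    train = idx[:n_train]
    val = idx[n_train:n_train + n_val]
    test = idx[n_train + n_val:]
    return train, val, test

train_idx, val_idx, test_idx = qm9_split()
```

The GEOM-Drugs 8:1:1 split would follow the same pattern with proportional rather than fixed sizes.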
Hardware Specification | Yes | For QM9, we trained for 2000 epochs with a batch size of 64, which took approximately 7 days on a single A100 GPU. For GEOM-Drugs, we trained for 13 epochs with a batch size of 32 × 4 = 128, taking 6.5 days on four A100 GPUs.
Software Dependencies | No | The paper mentions using 'EGNN (Satorras et al., 2021) as our backbone' and the 'Adam optimizer', but does not specify version numbers for general software dependencies like Python, PyTorch, or TensorFlow.
Experiment Setup | Yes | We implement UniGEM with 9 layers, each consisting of 256 dimensions for the hidden layers. The shared layer count is set to 1, while each separate branch, corresponding to different ranges of time steps, has 8 layers. Optimization is performed using the Adam optimizer, with a batch size of 64 and a learning rate of 10⁻⁴. We train UniGEM on QM9 for 2,000 epochs, while training on the GEOM-Drugs dataset is limited to 13 epochs due to the larger scale of its training data. We set the nucleation time tn to 10 and the total time steps T to 1000, ensuring that the prediction of atom types and properties occurs within the first 10 time steps. The atom type and molecular property predicted in the last time step are used as the final predictive result.
Table 10: Network and training hyperparameters:
- Embedding size: 256
- Layer number: 9 (QM9), 4 (GEOM-Drugs)
- Shared layers: 1
- Batch size: 64 (QM9), 32 (GEOM-Drugs)
- Training epochs: 2000 (QM9), 13 (GEOM-Drugs)
- Learning rate: 1.00 × 10⁻⁴
- Optimizer: Adam
- T (sample times): 1000
- Nucleation time: 10
- Oversampling ratio: 0.5 for each branch
- Loss weight: 1 for each loss term
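The setup above couples a nucleation time tn = 10 with T = 1000 total steps, routes timesteps to separate branches by range, and lists an 'oversampling ratio 0.5 for each branch'. A minimal sketch of one plausible reading follows; the sampling scheme and all names here are our assumptions for illustration, not the released implementation.

```python
import random

T = 1000       # total diffusion time steps (Table 10)
T_NUC = 10     # nucleation time tn: prediction happens for t < tn

def sample_timestep(oversample_ratio=0.5):
    """Draw a training timestep, oversampling the short pre-nucleation
    window. With probability `oversample_ratio` sample from [0, tn),
    otherwise from [tn, T). Interpreting 'oversampling ratio 0.5 for
    each branch' this way is an assumption, not stated paper detail.
    """
    if random.random() < oversample_ratio:
        return random.randrange(0, T_NUC)
    return random.randrange(T_NUC, T)

def route_branch(t):
    """Route a timestep to the branch handling its time range:
    atom-type/property prediction before nucleation, coordinate
    generation afterwards (branch names are illustrative)."""
    return "prediction" if t < T_NUC else "generation"
```

Without oversampling, only ~1% of uniformly drawn timesteps would land in the 10-step prediction window, so the prediction branch would be severely under-trained; the 0.5 ratio balances gradient signal between the two branches.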