MADGEN: Mass-Spec attends to De Novo Molecular generation

Authors: Yinkai Wang, Xiaohui Chen, Liping Liu, Soha Hassoun

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate MADGEN on three datasets (NIST23, CANOPUS, and MassSpecGym) and evaluate MADGEN's performance with a predictive scaffold retriever and with an oracle retriever. We demonstrate the effectiveness of using attention to integrate spectral information throughout the generation process to achieve strong results with the oracle retriever.
Researcher Affiliation | Academia | Yinkai Wang, Xiaohui Chen, Liping Liu, Soha Hassoun, Department of Computer Science, Tufts University, EMAIL
Pseudocode | No | The paper describes the methodology and model architectures using text, mathematical formulations, and diagrams (Figure 1, Figure 2) but does not include any explicit pseudocode blocks or algorithms labeled as such.
Open Source Code | Yes | Our code is available at https://github.com/HassounLab/MADGEN
Open Datasets | Yes | We evaluate the performance of MADGEN on three datasets (Table 1). The NIST23 dataset (National Institute of Standards and Technology (NIST), 2023)... The CANOPUS dataset... The newly developed MassSpecGym benchmark dataset (Bushuiev et al., 2024) is collected from many public reference spectral databases and curated uniformly.
Dataset Splits | Yes | The NIST23 and CANOPUS datasets were split into training, validation, and test sets based on the scaffold, ensuring that scaffolds are unique to each split. This split prevents data leakage and ensures robust evaluation of model performance. For MassSpecGym, we utilized the split suggested by the benchmark (Bushuiev et al., 2024), which is based on the Maximum Common Edge Substructure (MCES).
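The scaffold-unique split described in this row can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `scaffold_split` helper is our own name, and it assumes scaffold strings have already been computed (in practice one would use e.g. RDKit's Murcko scaffold utilities).

```python
import random
from collections import defaultdict

def scaffold_split(records, frac_train=0.8, frac_valid=0.1, seed=0):
    """Split (smiles, scaffold) pairs so each scaffold lands in exactly one split.

    The scaffold string is assumed to be precomputed (e.g. via RDKit's
    MurckoScaffold module); only the grouping logic is shown here.
    """
    groups = defaultdict(list)
    for smiles, scaffold in records:
        groups[scaffold].append(smiles)

    scaffolds = sorted(groups)                 # deterministic base order
    random.Random(seed).shuffle(scaffolds)     # seeded shuffle of scaffold groups

    n = len(records)
    train, valid, test = [], [], []
    for sc in scaffolds:
        # Assign the whole scaffold group to a single split, so no scaffold
        # ever appears in more than one of train/valid/test.
        if len(train) < frac_train * n:
            train.extend(groups[sc])
        elif len(valid) < frac_valid * n:
            valid.extend(groups[sc])
        else:
            test.extend(groups[sc])
    return train, valid, test
```

Because assignment happens per scaffold group rather than per molecule, the split fractions are approximate, but no scaffold can leak across splits.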
Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or memory specifications used for running the experiments.
Software Dependencies | No | The paper mentions tools like RDKit but does not provide specific version numbers for software dependencies or libraries used for implementation.
Experiment Setup | Yes | The model was trained using a graph transformer with 5 layers and 50 diffusion steps. We employed the AdamW optimizer with a learning rate of 1×10^-5. Full training details and hyperparameters can be found in Appendix A.2. Appendix A.2: The model is trained with a batch size of 64 and employed 47 workers for data loading. The learning rate is set to 2×10^-4, while weight decay is configured at 1×10^-12. Training proceeds for 2000 epochs, with the model logging progress every 40 steps. A Markov bridge process with 100 steps is employed during training, and a cosine noise schedule is employed. The model consists of 5 layers, with node, edge, and spectral features set at 64 dimensions each. The MLP hidden dimensions are configured to 256 for nodes, 128 for edges, and 256 for spectral features. The model also employs 8 attention heads for cross-attention and self-attention mechanisms. The feedforward dimensions are set to 256 for nodes, 128 for edges, and 128 for global features.
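The Appendix A.2 hyperparameters quoted in this row can be collected into a single configuration object for reference. This is a sketch only: the class and field names are our own, not taken from the MADGEN codebase, and the values simply transcribe what the appendix reports.

```python
from dataclasses import dataclass

@dataclass
class MADGenTrainConfig:
    """Hyperparameters as reported in MADGEN's Appendix A.2 (names illustrative)."""
    batch_size: int = 64
    num_workers: int = 47          # data-loading workers
    lr: float = 2e-4               # appendix value; main text reports 1e-5
    weight_decay: float = 1e-12
    epochs: int = 2000
    log_every: int = 40            # logging interval in steps
    bridge_steps: int = 100        # Markov bridge steps, cosine noise schedule
    n_layers: int = 5
    node_dim: int = 64             # node feature dimension
    edge_dim: int = 64             # edge feature dimension
    spectral_dim: int = 64         # spectral feature dimension
    node_mlp_dim: int = 256
    edge_mlp_dim: int = 128
    spectral_mlp_dim: int = 256
    n_heads: int = 8               # cross- and self-attention heads
    node_ff_dim: int = 256
    edge_ff_dim: int = 128
    global_ff_dim: int = 128
```

Note the discrepancy the quote itself carries: the main text mentions 50 diffusion steps and a learning rate of 1×10^-5, while the appendix specifies 100 Markov bridge steps and 2×10^-4; the sketch follows the appendix.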