Efficient and Scalable Density Functional Theory Hamiltonian Prediction through Adaptive Sparsity

Authors: Erpai Luo, Xinran Wei, Lin Huang, Yunyang Li, Han Yang, Zaishuo Xia, Zun Wang, Chang Liu, Bin Shao, Jia Zhang

ICML 2025

Reproducibility Variable Result LLM Response
Research Type Experimental Extensive evaluations on the QH9 and PubChem QH datasets demonstrate that SPHNet achieves state-of-the-art accuracy while providing up to a 7x speedup over existing models. Our code can be found at https://github.com/microsoft/SPHNet. (Abstract) We first evaluated the performance of SPHNet on the QH9 dataset. ... For two stable split sets, SPHNet attained a speedup of 3.7x to 4x over QHNet while improving the accuracy of the Hamiltonian Mean Absolute Error (MAE) and the occupied energy MAE. Furthermore, SPHNet significantly reduced GPU memory usage, requiring only 30% of the baseline model's memory.
Researcher Affiliation Collaboration (1) Department of Automation, Tsinghua University, Beijing, China; (2) Microsoft Research AI for Science, Beijing, China; (3) Department of Computer Science, Yale University, CT, USA; (4) Department of Computer Science, University of California, CA, USA. Correspondence to: Xinran Wei <EMAIL>, Lin Huang <EMAIL>, Jia Zhang <EMAIL>.
Pseudocode No The paper describes the methodology using textual explanations and mathematical equations in Section 4 (Methodology) and Appendix D (Additional Model Details), along with architectural diagrams in Figure 2 and Figure 9, but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code Yes Our code can be found at https://github.com/microsoft/SPHNet.
Open Datasets Yes Three datasets were used in the experiments: the MD17 dataset, which includes Hamiltonian matrices for the trajectory data of four molecules...; the QH9 dataset (Yu et al., 2024), consisting of Hamiltonian matrices for 134k small molecules...; and the PubChem QH dataset (Li et al., 2025b), containing Hamiltonian matrices for 50k molecules...
Dataset Splits Yes The training and test sets in the QH9 dataset were split in the same manner as the official implementation, including four different split sets. The MD17 dataset was randomly split into train, validation, and test sets of the same size as in the QHNet experiment. The PubChem QH dataset was split randomly into train, validation, and test sets at 80%, 10%, and 10%. (Appendix A.1) Table 7, "Model Training Settings for Different Datasets", provides the specific train/val/test splits, e.g., "Random (500/500/3900)" for MD17 Water and "Random (40257/5032/5032)" for PubChem QH.
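The 80%/10%/10% random split reported for PubChem QH (40257/5032/5032 per Table 7) can be sketched as a simple shuffled index split. This is an illustration only: the function name, seed, and rounding convention below are assumptions, not taken from the SPHNet code.

```python
import random

def random_split(n, fracs=(0.8, 0.1, 0.1), seed=0):
    """Shuffle sample indices and split them into train/val/test by fraction.

    Rounding convention is an assumption; the last split absorbs the remainder.
    """
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = round(n * fracs[0])
    n_val = round(n * fracs[1])
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

# 40257 + 5032 + 5032 = 50321 PubChem QH molecules
train, val, test = random_split(50321)
```

With this rounding, 50,321 samples split into 40257/5032/5032, matching the counts in Table 7.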
Hardware Specification Yes In our experiments, the speed and GPU memory metrics are tested on a single NVIDIA RTX A6000 46GB GPU. The complete training of the models is carried out on four 80GB NVIDIA A100 GPUs.
Software Dependencies Yes Our experiments are implemented based on PyTorch 2.1.0, PyTorch Geometric 2.5.0, and e3nn (Geiger et al., 2022) 0.5.1.
Experiment Setup Yes The batch size for training is 10 on the MD17 dataset (except Uracil, which was set to 5), 32 on the QH9 dataset, and 8 on the PubChem QH dataset. The number of training steps was 260,000 for the QH9 dataset, 200,000 for the MD17 dataset, and 300,000 for the PubChem QH dataset. There was a 1,000-step warmup with the polynomial schedule. The maximum learning rate was 1e-3 for the QH9 and PubChem QH datasets and 5e-4 for the MD17 dataset. (Appendix A.1) Table 7 further details "Model Training Settings for Different Datasets", including "Batch Size", "Training Steps", "Warmup Steps", "Learning Rate", "Lmax", "Sparsity Rate", and "TSS Epoch t" for each dataset.
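The warmup and learning-rate settings above can be sketched as a standard polynomial schedule. The paper only states a 1,000-step warmup with the polynomial schedule and a peak learning rate of 1e-3 (for QH9), so the decay exponent and end value below are assumptions, not the paper's exact schedule.

```python
def lr_at_step(step, max_lr=1e-3, warmup_steps=1_000, total_steps=260_000, power=1.0):
    """Polynomial LR schedule: linear warmup to max_lr, then polynomial decay to 0.

    Defaults mirror the QH9 settings (1,000 warmup steps, 260,000 training steps,
    peak LR 1e-3); the decay exponent `power` and zero end LR are assumptions.
    """
    if step < warmup_steps:
        # Warmup: ramp linearly from max_lr/warmup_steps up to max_lr.
        return max_lr * (step + 1) / warmup_steps
    # Decay: polynomial in the fraction of post-warmup steps completed.
    frac = (step - warmup_steps) / (total_steps - warmup_steps)
    return max_lr * (1.0 - frac) ** power
```

For the MD17 and PubChem QH runs, only `max_lr` and `total_steps` would change (5e-4/200,000 and 1e-3/300,000, respectively), per Table 7.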