Learning the Electronic Hamiltonian of Large Atomic Structures
Authors: Chen Hao Xia, Manasa Kaniselvan, Alexandros Nikolaos Ziogas, Marko Mladenović, Rayen Mahjoub, Alexander Maeder, Mathieu Luisier
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate its capabilities by predicting the electronic Hamiltonian of various systems with up to 3,000 nodes (atoms), 500,000+ edges, 28 million orbital interactions (nonzero entries of H), and 0.53% error in the eigenvalue spectra. |
| Researcher Affiliation | Academia | Integrated Systems Laboratory, Department of Information Technology and Electrical Engineering, ETH Zurich, Zurich, Switzerland. Correspondence to: Chen Hao Xia <EMAIL>, Manasa Kaniselvan <EMAIL>. |
| Pseudocode | Yes | In Algorithm 1, we detail the procedure to partition the full graph G, described by the set of vertices V and edges E, into a set of slices {G1 . . . GN} which are augmented by virtual nodes and edges. |
| Open Source Code | No | The paper does not provide an explicit statement or link to the source code for the methodology described. |
| Open Datasets | Yes | The datasets are publicly available at https://huggingface.co/datasets/chexia8/Amorphous-Hamiltonians. |
| Dataset Splits | Yes | Table 1 lists the attributes of the generated dataset for three materials, each with its own training, validation, and test set; the [x, y, z] triplet defines the periodic unit cell size. For a-HfO2 (rcut = 8 Å, 3,000 atoms, 18,000 orbitals each): structure 1 is used for validation (527,348 edges; 52.876 × 26.308 × 26.242 Å), structure 2 for training (533,364 edges; 52.346 × 26.237 × 26.293 Å), and structure 3 for testing (530,920 edges; 52.722 × 26.267 × 26.191 Å). For the water dataset, both QHNet and this work use 500 train / 500 validation / 3,900 test molecules with batch size 10, achieving ϵtot of 10.79 µEh (QHNet) and 5.60 µEh (this work). |
| Hardware Specification | Yes | During the training of the full graph model, the peak memory consumption observed was 61.68 GiB on a single NVIDIA A100 GPU. Experiments were run on NVIDIA A100 GPUs with # ranks set to Nt. The computation time per H2O molecule was 7 s when run on 12 nodes with 12-core Intel Xeon E5-2680 CPUs and an NVIDIA P100 GPU, resulting in a total of 0.04 node hours. |
| Software Dependencies | No | The paper mentions several software tools and libraries by name (e.g., PyTorch, CP2K, LAMMPS, ASE), but does not provide specific version numbers for these software components. |
| Experiment Setup | Yes | Table 7 lists the hyper-parameters used for the a-HfO2, a-PtGe, and a-GST datasets: optimizer Adam; precision single (f32); scheduler ReduceLROnPlateau; initial learning rate 1×10⁻⁴; minimum learning rate 1×10⁻⁵; decay patience tpatience = 500; decay factor 0.5; threshold 1×10⁻³; maximum degree Lmax = 4; maximum order Mmax = 4; embedding size 16; number of attention heads Nh = 2; feedforward network dimension 64. |
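The learning-rate schedule reported in Table 7 (Adam at 1×10⁻⁴, halved by a ReduceLROnPlateau-style rule with patience 500, relative threshold 1×10⁻³, and floor 1×10⁻⁵) can be sketched in plain Python. This is an illustrative re-implementation of the plateau rule under those hyper-parameters, not the authors' code; the class name `PlateauLRSchedule` is hypothetical.

```python
class PlateauLRSchedule:
    """Sketch of a ReduceLROnPlateau-style rule with Table 7's settings."""

    def __init__(self, lr=1e-4, min_lr=1e-5, factor=0.5,
                 patience=500, threshold=1e-3):
        self.lr = lr                # initial learning rate (1e-4)
        self.min_lr = min_lr        # lower bound on the learning rate (1e-5)
        self.factor = factor        # multiplicative decay (0.5)
        self.patience = patience    # non-improving checks tolerated (500)
        self.threshold = threshold  # relative improvement margin (1e-3)
        self.best = float("inf")
        self.bad_steps = 0

    def step(self, val_loss):
        # Count an improvement only if it beats the best loss by the
        # relative threshold; otherwise accumulate "bad" steps.
        if val_loss < self.best * (1.0 - self.threshold):
            self.best = val_loss
            self.bad_steps = 0
        else:
            self.bad_steps += 1
            if self.bad_steps > self.patience:
                # Decay the learning rate, but never below the floor.
                self.lr = max(self.lr * self.factor, self.min_lr)
                self.bad_steps = 0
        return self.lr


sched = PlateauLRSchedule()
# A validation loss that never improves triggers exactly one decay
# after the patience window is exhausted.
for _ in range(502):
    lr = sched.step(1.0)
print(lr)  # 5e-05
```

In PyTorch, the equivalent configuration would pass these values to `torch.optim.lr_scheduler.ReduceLROnPlateau` alongside an `Adam` optimizer; the standalone sketch above just makes the decay mechanics explicit.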