The Generalized Skew Spectrum of Graphs

Authors: Armando Bellante, Martin Plávala, Alessandro Luongo

ICML 2025

Reproducibility assessment (variable, result, and supporting excerpt):
Research Type: Experimental. Evidence: "We illustrate our theoretical contributions with numerical experiments, demonstrating that our generalizations significantly improve the Skew Spectrum expressivity: distinguishing richer graphs, and distinguishing more non-isomorphic simple graphs at the same computational complexity." (Section 7, Numerical experiments)
Researcher Affiliation: Collaboration. Evidence: "1 Max-Planck-Institut für Quantenoptik, Hans-Kopfermann-Str. 1, 85748 Garching, Germany ... 6 Centre for Quantum Technologies, National University of Singapore, Singapore; 7 Inveriant Pte. Ltd., Singapore."
Pseudocode: Yes. Evidence: "Algorithm 1: Doubly-Reduced k-Spectrum; Algorithm 2: Precomputing s(k)"
Open Source Code: No. Evidence: "On the practical side, (1) developing a scalable, optimized open-source implementation with thorough benchmarking is a crucial step toward real-world adoption."
Open Datasets: Yes. Evidence: "We tested the Multi-Orbit generalization on the QM7 dataset, which contains Coulomb matrices of 7,165 molecules with up to 23 atoms (Rupp et al., 2012; Blum & Reymond, 2009). ... we extended the multi-orbit experiments on QM7 presented in Section 7.1 (Table 1) to two larger molecular datasets: QM9 and ZINC. Both datasets were loaded through torch.geometric. ... using two datasets of non-isomorphic, unweighted, undirected graphs: the Atlas of all graphs with 7 nodes, and a set of connected chordal graphs with 8 nodes (McKay)."
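The QM7 representation quoted above encodes each molecule as a Coulomb matrix (Rupp et al., 2012): diagonal entries 0.5 * Z_i^2.4 and off-diagonal entries Z_i * Z_j / |R_i - R_j|. A minimal sketch of that encoding, using a toy water-like geometry (the charges and positions here are illustrative, not taken from the dataset):

```python
import numpy as np

def coulomb_matrix(Z, R):
    """Coulomb matrix: C_ii = 0.5 * Z_i**2.4, C_ij = Z_i * Z_j / |R_i - R_j|."""
    Z = np.asarray(Z, dtype=float)
    R = np.asarray(R, dtype=float)
    n = len(Z)
    C = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i == j:
                C[i, j] = 0.5 * Z[i] ** 2.4
            else:
                C[i, j] = Z[i] * Z[j] / np.linalg.norm(R[i] - R[j])
    return C

# Toy water-like molecule: one oxygen (Z=8) and two hydrogens (Z=1).
Z = [8, 1, 1]
R = [[0.0, 0.0, 0.0],
     [0.96, 0.0, 0.0],
     [-0.24, 0.93, 0.0]]
C = coulomb_matrix(Z, R)
```

The resulting matrix is symmetric by construction; QM7 ships precomputed matrices of this form padded to the maximum molecule size.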
Dataset Splits: Yes. Evidence: "We train several models on an 80%-20% split: Extreme Gradient Boosting (XGB), Gradient Boosting Regressor (GBR), Elastic Net (EN), Linear Regression (Linear) (Pedregosa et al., 2011). ... We used the first 100,000 molecules for training and the remaining 30,831 for testing. ZINC contains 249,456 molecules with up to 38 nodes. We used the default train/test/validation split from torch.geometric, training on 220,011 molecules and testing on 5,000, ignoring the validation set for simplicity. ... We experiment with two dropout rates: the original 0.5 and a lower value of 0.2."
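An 80%-20% split like the one quoted above is typically produced with scikit-learn's `train_test_split`. A minimal sketch on synthetic data (the feature matrix and regression target below are placeholders, not the paper's spectrum features):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression

# Synthetic stand-in for per-molecule feature vectors and a regression target.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = X @ rng.normal(size=5) + 0.1 * rng.normal(size=200)

# 80%-20% train/test split, matching the quoted protocol.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

model = LinearRegression().fit(X_train, y_train)
r2 = model.score(X_test, y_test)
```

Fixing `random_state` makes the split reproducible across runs, which is the detail a reproducibility checklist cares about here.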
Hardware Specification: No. The paper does not provide specific hardware details (e.g., GPU/CPU models, memory amounts, or detailed machine specifications) used for running its experiments.
Software Dependencies: No. The paper mentions scikit-learn (Pedregosa et al., 2011) and torch.geometric but does not specify version numbers for these or other key software components used in its implementation.
Experiment Setup: Yes. Evidence: "A Random Forest classifier (60 estimators, no max depth) ... We trained multiple regression models on these representations: Extreme Gradient Boosting (XGB), Gradient Boosting Regressor (GBR), Elastic Net (EN), Linear Regression (Linear) (Pedregosa et al., 2011). ... including a learning rate of 0.001, a batch size of 32, and a maximum of 1000 training epochs. Early stopping is applied based on validation loss. We experiment with two dropout rates: the original 0.5 and a lower value of 0.2."
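The classifier configuration quoted above (60 estimators, no max depth) maps directly onto scikit-learn's `RandomForestClassifier`. A minimal sketch on synthetic labels (the data below is illustrative; the paper's inputs are graph spectrum features):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for feature vectors with a simple separable label.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 10))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# Matches the quoted setup: 60 estimators, no max depth (max_depth=None).
clf = RandomForestClassifier(n_estimators=60, max_depth=None, random_state=0)
clf.fit(X, y)
train_acc = clf.score(X, y)
```

With `max_depth=None`, each tree grows until its leaves are pure, which is scikit-learn's default and matches "no max depth" in the quoted setup.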