Generating Graphs via Spectral Diffusion
Authors: Giorgia Minello, Alessandro Bicciato, Luca Rossi, Andrea Torsello, Luca Cosmo
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | An extensive set of experiments on both synthetic and real-world graphs demonstrates the strengths of our model against state-of-the-art alternatives. |
| Researcher Affiliation | Academia | Giorgia Minello Ca' Foscari University EMAIL Alessandro Bicciato Ca' Foscari University EMAIL Luca Rossi The Hong Kong Polytechnic University EMAIL Andrea Torsello Ca' Foscari University EMAIL Luca Cosmo Ca' Foscari University EMAIL |
| Pseudocode | No | The paper describes the proposed pipeline and neural network architectures with diagrams (Figure 1 and Figure 2) and textual descriptions in Section 4, but it does not contain any explicit pseudocode or algorithm blocks. |
| Open Source Code | Yes | In order to guarantee the reproducibility of both our model architecture and results, we have made our code accessible on an online public repository 2 |
| Open Datasets | Yes | The synthetic datasets we consider are (i) Community-small (12 ≤ \|V\| ≤ 20), (ii) Planar (\|V\| = 64), and (iii) Stochastic Block Model (SBM) (2-5 communities and 20-40 nodes per community). The real-world datasets are both from the molecular domain, namely (i) Proteins (100-500 nodes) Dobson & Doig (2003) and (ii) QM9 (≤ 9 nodes) Ruddigkeit et al. (2012); Ramakrishnan et al. (2014). |
| Dataset Splits | Yes | for the synthetic datasets, we decided to create a larger set of test graphs: 200 graphs for Planar and SBM, and 100 graphs for Community-small. Accordingly, we let each model generate an equivalent number of graphs (200 for Planar and SBM, 100 for Community-small) to compute the MMD measures. Due to the limited number of graphs in the Proteins dataset (see Appendix A), we also followed a different and more robust protocol to evaluate the generated graphs on this dataset. Rather than utilizing a single subset of the dataset as a test set, we created 10 folds (identical for each method), allowing us to report the average of each metric (± standard deviation) over the 10 folds. ... For the training of the diffusion model, we split each dataset into 90% train and 10% test... For QM9, we allocate 10k molecules for validation, 10k for testing, and the rest for training. |
| Hardware Specification | Yes | These experiments were conducted on a computer equipped with an AMD Ryzen 7 3700X processor, 64GB of RAM, and an NVIDIA RTX 3070 8GB graphics card. |
| Software Dependencies | No | The paper mentions using RDKit for validity checking for molecule graphs (QM9) but does not provide a specific version number. It also implicitly relies on deep learning frameworks but does not list any with version numbers. |
| Experiment Setup | Yes | For the training of the diffusion model, we split each dataset into 90% train and 10% test, and we train the Spectral Diffusion on the whole dataset for 100k epochs, using early stopping on the reconstruction loss. We performed a grid search on the number of layers over 6, 9, and 12, and selected the best model according to the degree metric computed between the graphs reconstructed directly from the eigenvectors/values and the graphs of the training set. The sampling has been done using DDIM with 200 steps. Moreover, we generate each sample 4 times and keep the one with the lowest deviation from orthogonality. For the training of the Graph Predictor, we used the same splits as the Spectral Diffusion, and trained for 100k epochs. We performed early stopping by comparing the degree distribution of the generated graphs with the training graphs. We used 6 PPGN layers and 3 PPGN layers for the Graph Predictor and the discriminator network respectively, except for QM9, in which the Graph Predictor is also composed of three layers. For QM9, we let the Graph Predictor generate edge features as well, similarly to Martinkus et al. (2022). For all datasets, following the observations in Appendix E, we train both Spectral Diffusion and Predictor on the 16 smallest and 32 largest eigenpairs and select the final model according to the best average metrics on the validation set. |
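The paper's setup states that each spectrum is sampled 4 times and the candidate with the lowest deviation from orthogonality is kept. The paper does not spell out the selection code; the sketch below is a minimal, hypothetical illustration of that best-of-n step, measuring deviation as the Frobenius distance of V^T V from the identity and using a toy random sampler in place of the actual DDIM sampler.

```python
import numpy as np

def orthogonality_deviation(V):
    # Frobenius distance of V^T V from the identity; a hypothetical stand-in
    # for the paper's "deviation from orthogonality" criterion.
    k = V.shape[1]
    return np.linalg.norm(V.T @ V - np.eye(k))

def best_of_n(sample_fn, n=4):
    # Draw n candidate eigenvector matrices and keep the most orthogonal one,
    # mirroring the "generate each sample 4 times" protocol.
    candidates = [sample_fn() for _ in range(n)]
    return min(candidates, key=orthogonality_deviation)

# Toy stand-in for the DDIM sampler: near-orthogonal random matrices.
rng = np.random.default_rng(0)
def fake_sampler(num_nodes=16, k=8):
    Q, _ = np.linalg.qr(rng.standard_normal((num_nodes, k)))
    return Q + 0.01 * rng.standard_normal((num_nodes, k))

V = best_of_n(fake_sampler, n=4)
print(orthogonality_deviation(V))
```

In the actual pipeline, `sample_fn` would be one full 200-step DDIM run of the Spectral Diffusion model; only the selection logic is sketched here.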