Evaluating Graph Generative Models with Graph Kernels: What Structural Characteristics Are Captured?
Authors: Martijn Gösgens, Alexey Tikhonov, Liudmila Prokhorenkova
TMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To conduct a detailed analysis, we propose a framework for comparing graph kernels in terms of which high-level structural properties they are sensitive to... We show that using such diverse models with the corresponding transitions is crucial for evaluation: many kernels can successfully capture some properties and fail on others. We also found some well-known kernels that show good performance in our experiments... The results are shown in Table 1 and the respective computation times are reported in Appendix B. |
| Researcher Affiliation | Collaboration | Martijn Gösgens (EMAIL); Alexey Tikhonov (EMAIL), Independent researcher; Liudmila Prokhorenkova (EMAIL), Yandex Research. This research was conducted while Martijn Gösgens was employed by Eindhoven University of Technology. |
| Pseudocode | No | The paper describes various graph kernel algorithms and graph generation models but does not present any of them in a structured pseudocode block or algorithm listing. |
| Open Source Code | Yes | Our code and experiments are available at https://github.com/MartijnGosgens/graph-kernels. |
| Open Datasets | Yes | We use the Erdős-Rényi (ER) random graph model as baseline generator... We use the Chung-Lu model... The simplest generative model for community structure is the Planted Partition (PP) model... This model is referred to as a random geometric graph... To model varying dimensionality, we use the random geometric graph model... |
| Dataset Splits | No | The paper describes generating graphs for experimental evaluation (e.g., 'We consider sets of g = 100 graphs and compute s = 30 different MMD values'), rather than using fixed training/test/validation splits from a pre-existing dataset. The concept of train/test/validation splits, as typically defined for model training, does not apply to this methodology. |
| Hardware Specification | Yes | The experiments were conducted on a laptop with AMD Ryzen 7 8840HS CPU and 16GB RAM. |
| Software Dependencies | No | For most graph kernels, we use the GraKeL Python library (Siglidis et al., 2020). For NetLSD and RandGIN we use the implementation provided by the authors with the default parameters. Specific version numbers for these libraries or other software are not provided. |
| Experiment Setup | Yes | In most of our experiments, we consider graphs with n = 50 nodes and (in expectation) m = 190 edges. We discretize the interpolation interval [0, 1] by Θ = {0.0, 0.1, . . . , 1.0}. Thus, we have |Θ| = 11 steps in our interpolation. We consider sets of g = 100 graphs and compute s = 30 different MMD values for each pair of interpolation steps that we compare... For the considered kernels, we use their default hyperparameters listed above (that are either the default parameters of the implementation or the most commonly used parameters in the literature). |
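The "Open Datasets" row quotes several graph generators (Erdős-Rényi, Chung-Lu, Planted Partition, random geometric graphs). The sketch below is a hypothetical, minimal NumPy illustration of these model families, not the authors' code (which uses its own generators); all parameter values here are illustrative assumptions, with `N = 50` matching the paper's setup.

```python
import numpy as np

N = 50  # number of nodes, matching n = 50 from the paper's setup

def er_adjacency(p, rng):
    """Erdos-Renyi G(N, p): each edge present independently with probability p."""
    upper = np.triu(rng.random((N, N)) < p, k=1)
    return upper | upper.T

def chung_lu_adjacency(weights, rng):
    """Chung-Lu: edge (i, j) present with probability min(w_i * w_j / sum(w), 1),
    so expected degrees approximate the given weights."""
    w = np.asarray(weights, dtype=float)
    probs = np.minimum(np.outer(w, w) / w.sum(), 1.0)
    upper = np.triu(rng.random((N, N)) < probs, k=1)
    return upper | upper.T

def planted_partition_adjacency(p_in, p_out, rng):
    """Planted Partition: two equal communities; edges within a community appear
    with probability p_in, edges between communities with probability p_out."""
    labels = np.arange(N) < N // 2
    probs = np.where(labels[:, None] == labels[None, :], p_in, p_out)
    upper = np.triu(rng.random((N, N)) < probs, k=1)
    return upper | upper.T

def geometric_adjacency(radius, dim, rng):
    """Random geometric graph: N points uniform in [0, 1]^dim; connect every
    pair closer than `radius` (varying `dim` models varying dimensionality)."""
    pts = rng.random((N, dim))
    dist = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
    adj = dist < radius
    np.fill_diagonal(adj, False)
    return adj

rng = np.random.default_rng(0)
A = er_adjacency(0.155, rng)                          # baseline generator
B = chung_lu_adjacency(rng.uniform(3, 15, size=N), rng)
C = planted_partition_adjacency(0.3, 0.05, rng)       # community structure
D = geometric_adjacency(0.3, 2, rng)                  # geometric structure
```

Each function returns a symmetric boolean adjacency matrix with an empty diagonal, which is the common interface a kernel-evaluation pipeline would consume.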
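The experiment-setup row describes comparing sets of g = 100 graphs via MMD values. As a hedged sketch of that protocol, the following generates two sets of Erdős-Rényi graphs with the quoted parameters (n = 50, about 190 expected edges) and computes a squared MMD. The paper uses GraKeL graph kernels; the degree-histogram embedding with an RBF kernel below is only an illustrative stand-in, and `gamma = 10.0` is an assumed value.

```python
import numpy as np

N_NODES = 50                                          # n = 50 nodes
M_EDGES = 190                                         # expected edge count
P_EDGE = M_EDGES / (N_NODES * (N_NODES - 1) / 2)      # ER edge probability

def sample_degree_histograms(g, p, rng):
    """Sample g Erdos-Renyi graphs G(N_NODES, p); return a (g, N_NODES) matrix
    whose rows are normalized degree histograms (a toy graph embedding)."""
    rows = []
    for _ in range(g):
        upper = np.triu(rng.random((N_NODES, N_NODES)) < p, k=1)
        adj = upper | upper.T
        deg = adj.sum(axis=1)
        rows.append(np.bincount(deg, minlength=N_NODES) / N_NODES)
    return np.stack(rows)

def rbf(X, Y, gamma=10.0):
    """RBF kernel matrix between the rows of X and the rows of Y."""
    sq = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq)

def mmd_squared(X, Y):
    """Biased (V-statistic) estimate of squared MMD; non-negative by construction."""
    return rbf(X, X).mean() + rbf(Y, Y).mean() - 2.0 * rbf(X, Y).mean()

rng = np.random.default_rng(0)
set_a = sample_degree_histograms(100, P_EDGE, rng)      # g = 100 graphs per set
set_b = sample_degree_histograms(100, P_EDGE, rng)      # same generator
set_c = sample_degree_histograms(100, 2 * P_EDGE, rng)  # denser generator

same_mmd = mmd_squared(set_a, set_b)   # near zero: same distribution
diff_mmd = mmd_squared(set_a, set_c)   # larger: distributions differ
print(f"MMD^2 same model: {same_mmd:.4f}, different model: {diff_mmd:.4f}")
```

A sensitive kernel should yield a clearly larger MMD between sets drawn from different generators than between sets drawn from the same generator, which is the intuition behind the paper's interpolation-based evaluation.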