Maximally Expressive GNNs for Outerplanar Graphs
Authors: Franka Bause, Fabian Jogl, Patrick Indri, Tamara Drucks, David Penz, Nils Morten Kriege, Thomas Gärtner, Pascal Welke, Maximilian Thiessen
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments confirm that our graph transformation improves the predictive performance of MPNNs on molecular benchmark datasets at negligible computational overhead. Section 5: Experimental Evaluation. We investigate whether our proposed method CAT can improve the predictive performance of MPNNs on molecular benchmark datasets. We utilize three commonly used MPNNs: GIN (Xu et al., 2019), GCN (Kipf & Welling, 2017), and GAT-v2 (Veličković et al., 2018; Brody et al., 2022). We train on the commonly used ZINC (Gómez-Bombarelli et al., 2018; Sterling & Irwin, 2015) and MOLHIV (Hu et al., 2020) datasets, which contain graphs representing molecules. We supplement these with 7 small datasets (see Table 4) from the OGB collection (Hu et al., 2020). In addition to the ZINC dataset with 12k graphs we also use the larger ZINC250k variant with 250k graphs. In total, we train three MPNNs on ten datasets with and without CAT. For each configuration, we separately tune hyperparameters on the validation set and train a model with the best hyperparameters ten times on the training set and evaluate it on the test set. The only exception to this is ZINC250k where we evaluate the final hyperparameter configuration five times, due to the large dataset size. For each dataset we report the mean and standard deviation of the most common evaluation metric on the test set in the epoch with the best validation performance. |
| Researcher Affiliation | Academia | 1Faculty of Computer Science, University of Vienna, Vienna, Austria 2UniVie Doctoral School Computer Science, University of Vienna, Vienna, Austria 3Machine Learning Research Unit, TU Wien, Vienna, Austria 4Center for Artificial Intelligence and Machine Learning, Vienna, Austria 5Multimedia Mining and Search, Johannes Kepler University Linz, Linz, Austria 6Research Network Data Science, University of Vienna, Vienna, Austria {firstname.lastname}@{univie, tuwien}.ac.at |
| Pseudocode | Yes | Definition 1. The CAT°(G) = G′ transformation maps a biconnected outerplanar graph G = (V, E, µ, ν) to a new graph G′ = (V′, E′, µ′, ν′) as follows: 1. Let (v₁, …, vₙ) be the Hamiltonian cycle of G and C⃗ and C⃖ be its directed versions. 2. Add node- and edge-disjoint copies of C⃗ and C⃖ together with the corresponding edges to an empty graph G′ and for all e ∈ E′ set ν′(e) = (1, ν(e)). 3. Let D ⊆ E be the edges of G not on the (undirected) Hamiltonian cycle. Add edges in both directions to G′ for the copies of C⃗ and C⃖ for each edge in D: E′ = E′ ∪ E_d ∪ E′_d with E_d = ⋃_{vᵢvⱼ ∈ D} {v′ᵢv′ⱼ, v′ⱼv′ᵢ} and E′_d = ⋃_{vᵢvⱼ ∈ D} {v″ᵢv″ⱼ, v″ⱼv″ᵢ} for copies v′ᵢ of vᵢ in C⃗ (resp. v″ᵢ in C⃖). 4. For all v′ᵢ, v″ᵢ ∈ V′ set µ′(v′ᵢ) = µ′(v″ᵢ) = µ(vᵢ). For all v′ᵢv′ⱼ ∈ E_d set ν′(v′ᵢv′ⱼ) = (d_C⃗(vⱼ, vᵢ), ν(vᵢvⱼ)) and for all v″ᵢv″ⱼ ∈ E′_d set ν′(v″ᵢv″ⱼ) = (d_C⃖(v″ⱼ, v″ᵢ), ν(vⱼvᵢ)). Definition 2. The CAT(G) = G′ transformation maps a graph G to a new graph G′ as follows: 1. Let B₁, …, B_ℓ be the blocks of G and let F be the graph induced by the edges of G that are not in any block plus the nodes that are present in more than one block. Let {ℓ₁, ℓ₂, ℓ₃, ℓ₄, ℓ₅} be distinct node labels not in X. 2. Add F to G′ with labels µ′(v) = (ℓ₁, µ(v)) for all v ∈ F. 3. For each block Bᵢ in G: 3.1. Let B′ᵢ, B″ᵢ be the two connected components in CAT°(Bᵢ). Add B′ᵢ and B″ᵢ to G′. 3.2. For all pairs (v′, v″) of corresponding nodes in B′ᵢ and B″ᵢ add a node p_v with µ′(p_v) = (ℓ₂, µ(v)) and edges p_v v′, v′ p_v, p_v v″, v″ p_v to G′. 3.3. Add a node bᵢ to G′ with µ′(bᵢ) = ℓ₃. For all v ∈ V(Bᵢ) add edges bᵢ p_v and p_v bᵢ. 3.4. Let Aᵢ = V(Bᵢ) ∩ V(F) be the nodes of Bᵢ in F. 3.5. Let γᵢ : Aᵢ → V(B′ᵢ) map nodes of F to their copy in B′ᵢ. 3.6. For each a ∈ Aᵢ, let µ′(a) = (ℓ₄, µ(a)) and add edges p_{γᵢ(a)} a and a p_{γᵢ(a)} to G′. 4. Add a node g with µ′(g) = ℓ₅ to G′ and for all nodes bᵢ, add edges g bᵢ and bᵢ g to G′. 5. Let CAT(G) = G′. |
| Open Source Code | Yes | Our code can be found at https://github.com/ocatias/outerplanarGNN. |
| Open Datasets | Yes | We train on the commonly used ZINC (Gómez-Bombarelli et al., 2018; Sterling & Irwin, 2015) and MOLHIV (Hu et al., 2020) datasets, which contain graphs representing molecules. We supplement these with 7 small datasets (see Table 4) from the OGB collection (Hu et al., 2020). All other datasets are from the Open Graph Benchmark (Hu et al., 2020) based on MoleculeNet (Wu et al., 2018). |
| Dataset Splits | Yes | For each configuration, we separately tune hyperparameters on the validation set and train a model with the best hyperparameters ten times on the training set and evaluate it on the test set. The only exception to this is ZINC250k where we evaluate the final hyperparameter configuration five times, due to the large dataset size. |
| Hardware Specification | Yes | Our models are implemented in PyTorch Geometric (Fey & Lenssen, 2019) and trained on a single NVIDIA GeForce RTX 3080 GPU. We use WandB (Biewald, 2020) for tracking. The used server has 64 GB of RAM and an 11th Gen Intel(R) Core(TM) i9-11900KF CPU running at 3.50GHz. |
| Software Dependencies | No | Our models are implemented in PyTorch Geometric (Fey & Lenssen, 2019) and trained on a single NVIDIA GeForce RTX 3080 GPU. We use WandB (Biewald, 2020) for tracking. The text mentions software names like "PyTorch Geometric" and "WandB" but does not specify their version numbers. |
| Experiment Setup | Yes | For ZINC we use a batch size of 128 and an initial learning rate of 10⁻³ that gets halved if the validation metric does not improve for 20 epochs. The training stops after 500 epochs or if the learning rate dips below 10⁻⁵. For all other datasets we train with a fixed learning rate for 100 epochs and a batch size of 128. We use the same hyperparameter grid for all models and provide more details in Appendix E. Table 5: Hyperparameter grids for GIN, GCN and GAT on different datasets. |
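
The core of the Definition 1 pseudocode quoted above (two node-disjoint directed copies of the Hamiltonian cycle, plus diagonals added in both directions with cycle-distance labels) can be sketched in a few lines of plain Python. The function name, the edge-dict representation, and the input format (cycle order plus diagonal list) are illustrative assumptions here, not the authors' implementation:

```python
def cat_biconnected(cycle, diagonals):
    """Sketch of the CAT transformation for a biconnected outerplanar
    graph, given its Hamiltonian cycle as an ordered node list and the
    remaining (diagonal) edges. Returns a dict mapping directed edges
    ("fwd"/"bwd" copy tag, u, w) to their distance label."""
    n = len(cycle)
    index = {v: i for i, v in enumerate(cycle)}
    edges = {}
    # Steps 1-2: two node-disjoint directed copies of the Hamiltonian
    # cycle; every cycle edge carries distance label 1.
    for i in range(n):
        edges[("fwd", cycle[i], cycle[(i + 1) % n])] = 1
        edges[("bwd", cycle[i], cycle[(i - 1) % n])] = 1
    # Steps 3-4: each diagonal is added in both directions within each
    # copy, labelled with the distance between its endpoints along that
    # copy's directed cycle.
    for u, w in diagonals:
        d = (index[w] - index[u]) % n  # forward-cycle distance u -> w
        edges[("fwd", u, w)] = d
        edges[("fwd", w, u)] = n - d
        edges[("bwd", u, w)] = n - d
        edges[("bwd", w, u)] = d
    return edges
```

For a 5-cycle with one diagonal (0, 2), this produces 10 cycle edges plus 4 diagonal edges, with the diagonal labelled 2 in one direction and 3 in the other, matching the two traversal distances along the cycle.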
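
The ZINC schedule in the Experiment Setup row (initial learning rate 10⁻³, halved after 20 epochs without validation improvement, training stopped once the rate falls below 10⁻⁵ or after 500 epochs) matches the semantics of a reduce-on-plateau scheduler. A minimal pure-Python sketch of that stopping logic, with function and parameter names chosen here for illustration:

```python
def run_schedule(val_metrics, lr=1e-3, patience=20, factor=0.5,
                 min_lr=1e-5, max_epochs=500):
    """Sketch of the quoted ZINC schedule: halve the learning rate
    whenever the validation metric (lower is better) has not improved
    for `patience` epochs; stop once the rate drops below `min_lr` or
    after `max_epochs`. Returns (epochs_run, final_lr)."""
    best, since_best = float("inf"), 0
    metrics = val_metrics[:max_epochs]
    for epoch, val in enumerate(metrics):
        if val < best:
            best, since_best = val, 0      # validation improved
        else:
            since_best += 1
        if since_best >= patience:          # plateau: halve the rate
            lr *= factor
            since_best = 0
        if lr < min_lr:                     # stopping criterion
            return epoch + 1, lr
    return len(metrics), lr
```

With a completely flat validation curve, the rate is halved every 20 epochs and drops below 10⁻⁵ after the seventh halving, so training stops at epoch 141; this mirrors PyTorch's `ReduceLROnPlateau` with `factor=0.5` and `patience=20`.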