SkipPool: Improved Sparse Hierarchical Graph Pooling with Differentiable Exploration

Authors: Sarith Imaduwage

AAAI 2025

Reproducibility assessment (each entry lists the variable, the assessed result, and the LLM response):
Research Type: Experimental
"We tested SKIPPOOL and its variant SKIPPOOL-FULL, each against matching pooling methods. The proposed methods achieve new state-of-the-art performance on the majority of benchmark datasets. Moreover, we show that they are more robust at capturing the graph signal space, consequently preserving more graph-level information than their counterparts." From Section 4 (Experiments): "As mentioned in (Grattarola et al. 2022; Bianchi and Lachi 2024), some pooling layers can directly specify the pooling ratio and some cannot. SKIPPOOL belongs to the former category while SKIPPOOL-FULL belongs to the latter. Hence we evaluate both against different matching baselines separately, to answer the following questions: (Q1) How do SKIPPOOL and SKIPPOOL-FULL perform against other SSP methods (e.g., TOPKPOOL or SAGPOOL)? (Q2) How do they perform against the broader realm of pooling methods (e.g., DIFFPOOL, which is considered a dense pooling method)? (Q3) How do SKIPPOOL and SKIPPOOL-FULL overcome the limitations of SSP methods with their skipping, i.e., sampling, capabilities? Datasets: To analyze the ability to capture complex hierarchical structures, we employ graphs from different domains chosen from six benchmark datasets. We use the protein datasets PROTEINS (Borgwardt et al. 2005; Dobson and Doig 2003) and D&D (Dobson and Doig 2003; Shervashidze et al. 2011); the chemical compound datasets NCI1 and NCI109 (Wale and Karypis 2006; Shervashidze et al. 2011); FRANKENSTEIN (Orsini, Frasconi, and De Raedt 2015), which contains a set of molecular graphs; and the social network dataset REDDIT-BINARY (Yanardag and Vishwanathan 2015)."
Researcher Affiliation: Academia
"Sarith Imaduwage, Independent Researcher, EMAIL." The author is listed as an "Independent Researcher" with a personal email address, which indicates neither affiliation with a corporation or private-sector lab (Industry) nor a collaboration. By elimination, the paper is classified as Academia, reflecting an individual research contribution.
Pseudocode: No
The paper describes the methodology using narrative text, mathematical equations, and formal statements (e.g., Definitions 1-3, Proposition 1) but does not include any clearly labeled pseudocode or algorithm blocks.
Open Source Code: Yes
Code: https://github.com/SarithRavI/SkipPool
Open Datasets: Yes
"Datasets: To analyze the ability to capture complex hierarchical structures, we employ graphs from different domains chosen from six benchmark datasets. We use the protein datasets PROTEINS (Borgwardt et al. 2005; Dobson and Doig 2003) and D&D (Dobson and Doig 2003; Shervashidze et al. 2011); the chemical compound datasets NCI1 and NCI109 (Wale and Karypis 2006; Shervashidze et al. 2011); FRANKENSTEIN (Orsini, Frasconi, and De Raedt 2015), which contains a set of molecular graphs; and the social network dataset REDDIT-BINARY (Yanardag and Vishwanathan 2015). To answer Q3 we use six graphs of varying sizes, namely Grid and Ring extracted from the PyGSP (Defferrard et al. 2024) library and Airplane, Car, Guitar, and Person from the ModelNet40 (Wu et al. 2015) dataset."
Dataset Splits: Yes
"In each experiment, we use a stratified train/val/test split, and specifications related to splitting/repetitions are in the appendix. Best models are selected using cross-validation on validation data."
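The paper's exact split ratios and repetition counts live in its appendix, so the following is only a minimal sketch of a stratified train/val/test split using scikit-learn, with hypothetical binary graph labels and illustrative 70/10/20 ratios:

```python
from sklearn.model_selection import train_test_split

# Hypothetical graph-level labels; the real labels come from the benchmark datasets.
labels = [0] * 60 + [1] * 40
indices = list(range(len(labels)))

# Carve out the test set first, then split the remainder into train/val,
# stratifying on the class label at each step so class ratios are preserved.
train_val_idx, test_idx = train_test_split(
    indices, test_size=0.2, stratify=labels, random_state=0)
train_idx, val_idx = train_test_split(
    train_val_idx,
    test_size=0.125,  # 12.5% of the remaining 80% = 10% of the full set
    stratify=[labels[i] for i in train_val_idx],
    random_state=0)
```

Stratifying both splits keeps the class balance of the small validation set close to that of the full dataset, which matters for the model-selection step the paper describes.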
Hardware Specification: No
The paper mentions that SKIPPOOL and SKIPPOOL-FULL are CPU-bound algorithms and refers to GPU usage, but it does not provide specifics such as CPU/GPU models, memory, or any other hardware details for the experiments.
Software Dependencies: No
The paper mentions using GCN layers (Kipf and Welling 2017) and the PyGSP (Defferrard et al. 2024) library for some graphs, but it does not specify exact version numbers for any software libraries, frameworks, or programming languages used in the experiments.
Experiment Setup: Yes
"Hyperparameter spaces considered for each experiment are in Appendix E. In each experiment, we use a stratified train/val/test split, and specifications related to splitting/repetitions are in the appendix. Best models are selected using cross-validation on validation data. The total number of epochs is 100k unless specified otherwise, and all models are trained with early stopping subject to a patience criterion: training stops (early) if the validation loss doesn't improve for 50 epochs, and the model is restored to the epoch with the least validation loss, on which the test data is evaluated. In the temperature annealing, Ta, Tb, and R are set to 10, 0.1, and 1k, respectively. When the pooling ratio is 0.5 or 0.1, the stride length F for SKIPPOOL is 3 or 11, respectively; for SKIPPOOL-FULL, the stride length that pools closest to the required output size is picked. For scoring nodes in SKIPPOOL and its variants, we used the scoring method from SAGPOOL: f(x_i) = p^T GNN(x_i)."
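The quoted setup fixes only the endpoints and horizon of the temperature annealing (Ta = 10, Tb = 0.1, R = 1k); the functional form of the schedule is not stated here. As a sketch, assuming a geometric (exponential) interpolation between the endpoints:

```python
def anneal_temperature(step: int, t_a: float = 10.0,
                       t_b: float = 0.1, r: int = 1000) -> float:
    """Geometric decay from t_a to t_b over r steps.

    The geometric form is an assumption; the report only fixes the
    start temperature t_a, end temperature t_b, and horizon r.
    """
    # Clamp progress to [0, 1] so the temperature is constant after step r.
    frac = min(max(step, 0), r) / r
    return t_a * (t_b / t_a) ** frac
```

Under this assumed schedule the temperature starts at 10, passes through 1.0 at the midpoint (step 500), and settles at 0.1 from step 1000 onward; a linear schedule would be an equally plausible reading of the same three constants.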