Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
MaxCutPool: differentiable feature-aware MAXCUT for pooling in graph neural networks
Authors: Carlo Abate, Filippo Maria Bianchi
ICLR 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We consider three different tasks to demonstrate the effectiveness of MaxCutPool. The code to reproduce the reported results is publicly available. The details of the architectures used in each experiment and the hyperparameter selection procedure are described in Appendix C. We computed a MAXCUT partition with a simple GNN... We compared our model against... Results are shown in Tab. 2. Performance is computed in terms of the percentage of cut edges: the higher, the better. We conducted a preliminary ANOVA test (p-value < 0.05) for each dataset followed by a pairwise Tukey-HSD test (p-value < 0.05) to group models whose performance is not significantly different. Those belonging to the top-performing group are colored green. The ANOVA test failed on ENZYMES, PROTEINS, MUTAG, and DD, meaning that the ... Table 3: Mean and standard deviations of the graph classification accuracy. Overall, the MaxCutPool-E variant, which satisfies expressiveness conditions, exhibits similar or better performance compared to MaxCutPool across most datasets. In contrast, the performance decline observed in the MaxCutPool-NL variant demonstrates the importance of the auxiliary loss. |
| Researcher Affiliation | Academia | Carlo Abate, Alma Mater Studiorum University of Bologna; Fondazione Istituto Italiano di Tecnologia; EMAIL. Filippo Maria Bianchi, UiT the Arctic University of Norway; NORCE Norwegian Research Centre AS; EMAIL. |
| Pseudocode | Yes | The proposed algorithm is based on a Breadth First Search (BFS) of the graph and is detailed in the pseudo-code in Algorithm 1. Algorithm 1: Pseudo-code for the assignment scheme to the supernodes. |
| Open Source Code | Yes | The code to reproduce the reported results is publicly available. https://github.com/NGMLGroup/MaxCutPool |
| Open Datasets | Yes | We considered 9 graphs generated via the PyGSP library (Defferrard et al., 2017), including bipartite graphs such as the Grid2D and Ring, and 7 graphs from the GSet dataset (Ye, 2003), including random, planar, and toroidal graphs, typically used as benchmarks for evaluating MAXCUT algorithms (details in App. D.1). As graph classification datasets we consider 8 TUD datasets (COLLAB, DD, NCI1, ENZYMES, MUTAG, Mutagenicity, PROTEINS, and REDDIT-BINARY) (Morris et al., 2020), the Graph Classification Benchmark Hard (GCB-H) (Bianchi et al., 2022), and EXPWL1 (Bianchi & Lachi, 2023)... In addition, we introduce a novel dataset consisting of 5,000 multipartite graphs... The construction of the Multipartite dataset and a further discussion about its properties are reported in Appendix D.2. The specific instance of the dataset used in our experiments has 10 centers, 500 graphs per center, and a maximum of 20 nodes per cluster, and is available online: https://zenodo.org/doi/10.5281/zenodo.11616515. The datasets are the heterophilic graphs introduced by Platonov et al. (2023) and are loaded with the API provided by PyTorch Geometric. |
| Dataset Splits | Yes | The datasets were split via a 10-fold cross-validation procedure. The training dataset was further partitioned into a 90-10% train-validation random split. This approach is similar to the procedure described by Errica et al. (2020). ... The results are computed on the 10 public folds of these datasets. The nodes of each graph are already split in train, validation, and test across 10 different folds. |
| Hardware Specification | Yes | All measurements are done on an Nvidia RTX A6000. |
| Software Dependencies | No | The pooling methods Top-k, Diffpool, DMoN, Graclus, and MinCutPool are taken from PyTorch Geometric (Fey & Lenssen, 2019). For k-MIS we used the official implementation. For ECPool, we used the efficient parallel implementation proposed by Landolfi (2022). For NDP we adapted the original TensorFlow implementation to PyTorch. All pooling layers were used with the default hyperparameters. While these passages name the software used, they do not give version numbers for key components such as Python, PyTorch, CUDA, or the cited libraries themselves. |
| Experiment Setup | Yes | The model was trained for 2000 epochs with the Adam optimizer (Kingma & Ba, 2015), with the initial learning rate set to 8e-4. We used a learning rate scheduler that scales the learning rate by 0.8 when the auxiliary loss does not improve for 100 epochs. ... The models are trained using a batch size of 32 for 1000 epochs, using the Adam optimizer with an initial learning rate of 1e-4. We used early stopping that monitors the validation loss with a patience of 300 epochs. ... The node classifier was trained for 20,000 epochs, using the Adam optimizer with an initial learning rate of 5e-4. We used a learning rate scheduler that scales the learning rate by 0.5 when the validation loss does not improve for 500 epochs. We used early stopping that monitors the validation loss with a patience of 2,000 epochs. ... The ScoreNet is configured with the following hyperparameters: number of HetMP layers and number of features in each layer (we use the notation [32, 16, 8]...); activation function of the HetMP layers (as default, TanH); number of layers and features in the MLP (as default, [16, 16]); activation function of the MLP (as default, ReLU); smoothness hyperparameter δ (as default, 2); auxiliary loss weight β (as default, 1). Tables 5, 6, 7, and 8 further detail specific hyperparameter configurations for different tasks and datasets. |
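The statistical grouping quoted under Research Type (a preliminary one-way ANOVA followed by pairwise Tukey-HSD at the 0.05 level, colouring the top-performing group) can be sketched as follows. This is a generic illustration with hypothetical per-fold scores, not the authors' code or results:

```python
import numpy as np
from scipy.stats import f_oneway, tukey_hsd

rng = np.random.default_rng(0)
# Hypothetical 10-fold accuracies for three pooling methods (illustration only)
scores = {
    "MethodA": rng.normal(0.80, 0.02, 10),
    "MethodB": rng.normal(0.79, 0.02, 10),
    "MethodC": rng.normal(0.70, 0.02, 10),
}

# Step 1: one-way ANOVA across all methods.
_, p_anova = f_oneway(*scores.values())

top_group = set(scores)  # if the ANOVA fails, no method stands out
if p_anova < 0.05:
    # Step 2: pairwise Tukey-HSD; keep every method whose mean is not
    # significantly different from the best-performing one.
    res = tukey_hsd(*scores.values())
    names = list(scores)
    best = max(range(len(names)), key=lambda i: scores[names[i]].mean())
    top_group = {names[i] for i in range(len(names))
                 if res.pvalue[best, i] >= 0.05}

print(sorted(top_group))
```

When the ANOVA fails (as the report notes for ENZYMES, PROTEINS, MUTAG, and DD), the post-hoc test is skipped and no single top group can be declared.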
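The split protocol quoted under Dataset Splits (10-fold cross-validation with a further 90/10 random train-validation split of each training fold) can be sketched as below. The fold construction is a minimal illustration under that description, not the authors' implementation, which may differ in details such as stratification:

```python
import numpy as np

def cv_splits(n_samples, n_folds=10, val_frac=0.1, seed=0):
    """Yield (train, val, test) index arrays: n_folds-fold CV where each
    training portion is further split 90/10 into train and validation."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_samples), n_folds)
    for k in range(n_folds):
        test = folds[k]
        rest = np.concatenate([folds[j] for j in range(n_folds) if j != k])
        rest = rng.permutation(rest)                # shuffle before val split
        n_val = int(round(val_frac * len(rest)))    # 10% of the training part
        yield rest[n_val:], rest[:n_val], test

splits = list(cv_splits(100))  # e.g. 100 graphs -> 81 train / 9 val / 10 test
```

Each sample appears in the test set of exactly one fold, matching the standard protocol of Errica et al. (2020) cited in the report.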