A Fused Gromov-Wasserstein Approach to Subgraph Contrastive Learning
Authors: Amadou Siaka Sangare, Nicolas Dunou, Jhony H. Giraldo, Fragkiskos D. Malliaros
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through extensive experiments on benchmark graph datasets, we show that FOSSIL outperforms or achieves competitive performance compared to current state-of-the-art methods. |
| Researcher Affiliation | Academia | Amadou S. Sangare (LTCI, Télécom Paris, Institut Polytechnique de Paris, France); Nicolas Dunou (École des Mines de Saint-Étienne / Université Paris Dauphine-PSL, France); Jhony H. Giraldo (LTCI, Télécom Paris, Institut Polytechnique de Paris, France); Fragkiskos D. Malliaros (Université Paris-Saclay, CentraleSupélec, Inria, France) |
| Pseudocode | Yes | In this work, we solve the optimization problem associated with FGWD using the Bregman Alternated Projected Gradient (BAPG) method (Li et al., 2023) given in Alg. 1, App. A. |
| Open Source Code | Yes | The source code of FOSSIL is publicly available at https://github.com/sangaram/FOSSIL |
| Open Datasets | Yes | We evaluate all methods on four homophilic datasets, Cora (McCallum et al., 2000), CiteSeer (Sen et al., 2008), PubMed (Namata et al., 2012), and Coauthor-CS (Shchur et al., 2018); and on three heterophilic datasets, Actor (Tang et al., 2009), Chameleon, and Squirrel (Rozemberczki et al., 2021). We perform additional experiments on large-scale datasets in App. B. Table 1 shows the statistics of the datasets tested in this work, where H(G) is the homophily of the graph as defined in (Pei et al., 2020). |
| Dataset Splits | Yes | We split the data into a development set (80%) and a test set (20%) once, ensuring that the test set is not used during the hyperparameter optimization process. We then split the development set into training and validation sets for hyperparameter tuning. Following (Klicpera et al., 2019; Giraldo et al., 2023), we take 20 nodes per class for the training set and assign the rest to the validation set for Cora, CiteSeer, and PubMed. For the other datasets, the training set gets 60% of the development-set nodes and the validation set gets the other 40%. Finally, we test each method using 100 seeds to randomly re-split the development set into training and validation sets, while keeping the original test set fixed. |
| Hardware Specification | Yes | All experiments are conducted on A40 48GB and P100 16GB GPUs. |
| Software Dependencies | No | We implement all methods using PyTorch (Ketkar, 2017) and PyG (Fey & Lenssen, 2019). Our encoder is a two-layer GCN with hidden dimension 1,024 and output dimension 512, and a PReLU (He et al., 2015) activation function between them. In all experiments, our method is trained for 300 epochs with the Adam (Kingma & Ba, 2015) optimizer. We optimize the other hyperparameters with the framework Optuna (Akiba et al., 2019). |
| Experiment Setup | Yes | Our encoder is a two-layer GCN with hidden dimension 1,024 and output dimension 512, and a PReLU (He et al., 2015) activation function between them. In all experiments, our method is trained for 300 epochs with the Adam (Kingma & Ba, 2015) optimizer. We optimize the other hyperparameters with the framework Optuna (Akiba et al., 2019). The search spaces for the hyperparameters are defined as follows: 1) learning rate lr ∈ {10^-4, 5×10^-4, 10^-3, 5×10^-3, 10^-2}; 2) learning rate of the fusion module lr_f ∈ {10^-4, 5×10^-4, 10^-3, 5×10^-3, 10^-2}; 3) α ∈ [0 : 0.1 : 1]; 4) the FGWD regularizer β ∈ {10^-3, 5×10^-3, 10^-2, 5×10^-2, 10^-1, 5×10^-1, 1, 1.5, 2}; 5) the number of nodes in each sampled subgraph k ∈ [10, 30]; 6) the temperature parameter τ ∈ {0.2, 0.5, 0.8, 1.0, 1.5, 2.0, 2.5, 3.0}; 7) the GNN dropout parameter p ∈ {0.1, 0.2, 0.3, 0.4}; and 8) the fusion MLP dropout p_f ∈ {0.1, 0.2, 0.3, 0.4}. The value of β2 is fixed to 1. We tune the hyperparameters by performing 100 trials on the development set. |
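The split protocol quoted in the "Dataset Splits" row can be sketched in a few lines. This is a minimal illustration, not the authors' code: the node IDs, labels, and helper names below are made up, and only the Cora/CiteSeer/PubMed branch (20 training nodes per class, rest to validation) is shown.

```python
import random
from collections import defaultdict

def dev_test_split(nodes, test_frac=0.2, seed=0):
    """One fixed 80/20 development/test split; the test set is never touched during tuning."""
    rng = random.Random(seed)
    shuffled = nodes[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_frac)
    return shuffled[n_test:], shuffled[:n_test]  # (development, test)

def train_val_split(dev_nodes, labels, per_class=20, seed=0):
    """Cora/CiteSeer/PubMed style: 20 nodes per class for training, the rest for validation."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for v in dev_nodes:
        by_class[labels[v]].append(v)
    train, val = [], []
    for members in by_class.values():
        rng.shuffle(members)
        train += members[:per_class]
        val += members[per_class:]
    return train, val

# Toy usage: 200 nodes, 2 balanced classes (hypothetical data).
nodes = list(range(200))
labels = {v: v % 2 for v in nodes}
dev, test = dev_test_split(nodes)            # done once, test set frozen
train, val = train_val_split(dev, labels)    # repeated over 100 seeds in the paper
```

Repeating `train_val_split` with 100 different seeds while keeping `test` fixed reproduces the evaluation protocol described above.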
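The hyperparameter search spaces listed in the "Experiment Setup" row can be written down directly. The sketch below samples one configuration with plain `random.choice`; the paper itself uses Optuna's samplers over the same spaces, and the dictionary keys are illustrative names, not identifiers from the FOSSIL code.

```python
import random

# Search spaces transcribed from the paper's experiment setup.
SEARCH_SPACE = {
    "lr":    [1e-4, 5e-4, 1e-3, 5e-3, 1e-2],          # encoder learning rate
    "lr_f":  [1e-4, 5e-4, 1e-3, 5e-3, 1e-2],          # fusion-module learning rate
    "alpha": [round(0.1 * i, 1) for i in range(11)],  # 0.0 to 1.0, step 0.1
    "beta":  [1e-3, 5e-3, 1e-2, 5e-2, 1e-1, 5e-1, 1, 1.5, 2],  # FGWD regularizer
    "k":     list(range(10, 31)),                     # nodes per sampled subgraph
    "tau":   [0.2, 0.5, 0.8, 1.0, 1.5, 2.0, 2.5, 3.0],  # temperature
    "p":     [0.1, 0.2, 0.3, 0.4],                    # GNN dropout
    "p_f":   [0.1, 0.2, 0.3, 0.4],                    # fusion MLP dropout
}

def sample_trial(rng):
    """Draw one hyperparameter configuration uniformly from each space."""
    return {name: rng.choice(values) for name, values in SEARCH_SPACE.items()}

rng = random.Random(42)
trial = sample_trial(rng)  # one of the 100 tuning trials
```

In practice each such `trial` would train the model on the development set; Optuna additionally adapts its sampling based on previous trials rather than drawing uniformly.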