A Fused Gromov-Wasserstein Approach to Subgraph Contrastive Learning
Authors: Amadou Siaka Sangare, Nicolas Dunou, Jhony H. Giraldo, Fragkiskos D. Malliaros
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through extensive experiments on benchmark graph datasets, we show that FOSSIL outperforms or achieves competitive performance compared to current state-of-the-art methods. |
| Researcher Affiliation | Academia | Amadou S. Sangare (LTCI, Télécom Paris, Institut Polytechnique de Paris, France); Nicolas Dunou (École des Mines de Saint-Étienne / Université Paris Dauphine-PSL, France); Jhony H. Giraldo (LTCI, Télécom Paris, Institut Polytechnique de Paris, France); Fragkiskos D. Malliaros (Université Paris-Saclay, CentraleSupélec, Inria, France) |
| Pseudocode | Yes | In this work, we solve the optimization problem associated with FGWD using the Bregman Alternated Projected Gradient (BAPG) method (Li et al., 2023) given in Alg. 1, App. A. |
| Open Source Code | Yes | The source code of FOSSIL is publicly available at https://github.com/sangaram/FOSSIL |
| Open Datasets | Yes | We evaluate all methods on four homophilic datasets, Cora (McCallum et al., 2000), CiteSeer (Sen et al., 2008), PubMed (Namata et al., 2012), and Coauthor-CS (Shchur et al., 2018); and on three heterophilic datasets, Actor (Tang et al., 2009), Chameleon, and Squirrel (Rozemberczki et al., 2021). We perform additional experiments on large-scale datasets in App. B. Table 1 shows the statistics of the datasets tested in this work, where H(G) is the homophily of the graph as defined in (Pei et al., 2020). |
| Dataset Splits | Yes | We split the data into a development set (80%) and a test set (20%) once, ensuring that the test set is not used during the hyperparameter optimization process. We then split the development set into training and validation sets for hyperparameter tuning. Following (Klicpera et al., 2019; Giraldo et al., 2023), we take 20 nodes per class for the training set and assign the rest to the validation set for Cora, CiteSeer, and PubMed. For the other datasets, the training set gets 60% of the development-set nodes and the validation set gets the other 40%. Finally, we test each method using 100 seeds to randomly re-split the development set into training and validation sets, while keeping the original test set fixed. |
| Hardware Specification | Yes | All experiments are conducted on A40 48GB and P100 16GB GPUs. |
| Software Dependencies | No | We implement all methods using PyTorch (Ketkar, 2017) and PyG (Fey & Lenssen, 2019). Our encoder is a two-layer GCN with hidden dimension 1,024 and output dimension 512, and a PReLU (He et al., 2015) activation function between them. In all experiments, our method is trained for 300 epochs with the Adam (Kingma & Ba, 2015) optimizer. We optimize the other hyperparameters with the framework Optuna (Akiba et al., 2019). |
| Experiment Setup | Yes | Our encoder is a two-layer GCN with hidden dimension 1,024 and output dimension 512, and a PReLU (He et al., 2015) activation function between them. In all experiments, our method is trained for 300 epochs with the Adam (Kingma & Ba, 2015) optimizer. We optimize the other hyperparameters with the framework Optuna (Akiba et al., 2019). The search spaces for the hyperparameters are defined as follows: 1) learning rate lr ∈ {10^-4, 5×10^-4, 10^-3, 5×10^-3, 10^-2}; 2) learning rate of the fusion module lr_f ∈ {10^-4, 5×10^-4, 10^-3, 5×10^-3, 10^-2}; 3) α ∈ [0 : 0.1 : 1]; 4) the FGWD regularizer β ∈ {10^-3, 5×10^-3, 10^-2, 5×10^-2, 10^-1, 5×10^-1, 1, 1.5, 2}; 5) the number of nodes in each sampled subgraph k ∈ [10, 30]; 6) the temperature parameter τ ∈ {0.2, 0.5, 0.8, 1.0, 1.5, 2.0, 2.5, 3.0}; 7) the GNN dropout parameter p ∈ {0.1, 0.2, 0.3, 0.4}; and 8) the fusion MLP dropout p_f ∈ {0.1, 0.2, 0.3, 0.4}. The value of β2 is fixed to 1. We tune the hyperparameters by performing 100 trials on the development set. |
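The split protocol quoted in the "Dataset Splits" row can be sketched in a few lines. This is a minimal illustration, not the authors' code: the node IDs, labels, and helper names below are made up, and only the Cora/CiteSeer/PubMed branch (20 training nodes per class, rest to validation) is shown.

```python
import random
from collections import defaultdict

def dev_test_split(nodes, test_frac=0.2, seed=0):
    """One fixed 80/20 development/test split; the test set is never touched during tuning."""
    rng = random.Random(seed)
    shuffled = nodes[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_frac)
    return shuffled[n_test:], shuffled[:n_test]  # (development, test)

def train_val_split(dev_nodes, labels, per_class=20, seed=0):
    """Cora/CiteSeer/PubMed style: 20 nodes per class for training, the rest for validation."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for v in dev_nodes:
        by_class[labels[v]].append(v)
    train, val = [], []
    for members in by_class.values():
        rng.shuffle(members)
        train += members[:per_class]
        val += members[per_class:]
    return train, val

# Toy usage: 200 nodes, 2 balanced classes (hypothetical data).
nodes = list(range(200))
labels = {v: v % 2 for v in nodes}
dev, test = dev_test_split(nodes)            # done once, test set frozen
train, val = train_val_split(dev, labels)    # repeated over 100 seeds in the paper
```

Repeating `train_val_split` with 100 different seeds while keeping `test` fixed reproduces the evaluation protocol described above.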
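The hyperparameter search spaces listed in the "Experiment Setup" row can be written down directly. The sketch below samples one configuration with plain `random.choice`; the paper itself uses Optuna's samplers over the same spaces, and the dictionary keys are illustrative names, not identifiers from the FOSSIL code.

```python
import random

# Search spaces transcribed from the paper's experiment setup.
SEARCH_SPACE = {
    "lr":    [1e-4, 5e-4, 1e-3, 5e-3, 1e-2],          # encoder learning rate
    "lr_f":  [1e-4, 5e-4, 1e-3, 5e-3, 1e-2],          # fusion-module learning rate
    "alpha": [round(0.1 * i, 1) for i in range(11)],  # 0.0 to 1.0, step 0.1
    "beta":  [1e-3, 5e-3, 1e-2, 5e-2, 1e-1, 5e-1, 1, 1.5, 2],  # FGWD regularizer
    "k":     list(range(10, 31)),                     # nodes per sampled subgraph
    "tau":   [0.2, 0.5, 0.8, 1.0, 1.5, 2.0, 2.5, 3.0],  # temperature
    "p":     [0.1, 0.2, 0.3, 0.4],                    # GNN dropout
    "p_f":   [0.1, 0.2, 0.3, 0.4],                    # fusion MLP dropout
}

def sample_trial(rng):
    """Draw one hyperparameter configuration uniformly from each space."""
    return {name: rng.choice(values) for name, values in SEARCH_SPACE.items()}

rng = random.Random(42)
trial = sample_trial(rng)  # one of the 100 tuning trials
```

In practice each such `trial` would train the model on the development set; Optuna additionally adapts its sampling based on previous trials rather than drawing uniformly.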