Exploiting Edge Features in Graph-based Learning with Fused Network Gromov-Wasserstein Distance
Authors: Junjie Yang, Matthieu Labeau, Florence d'Alché-Buc
TMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we conduct a series of studies around the proposed distance; then, we assess its relevance in supervised graph prediction problems, including the Fingerprint to Molecule task and the Metabolite Identification task. A Python implementation of the algorithms presented for these tasks can be found on GitHub. We leave additional experiments on clustering to Appendix D. 4.1 Study of the FNGW Distance 4.1.1 Edge-featured Graph Classification We represent input graphs to be classified by kernels based on the FNGW distance; hence, we choose to use a variety of datasets where both node and edge features are present: Cuneiform (Kriege et al., 2018), MUTAG (Debnath et al., 1991; Kriege & Mutzel, 2012), PTC-MR (Helma et al., 2001; Kriege & Mutzel, 2012), BZR-MD, COX2-MD, DHFR-MD and ER-MD (Sutherland et al., 2003; Kriege & Mutzel, 2012). All these datasets were collected by Kersting et al. (2016). Our classifier is a Support Vector Machine (SVM) for which the kernel matrix K is computed with K_{i,j} = exp(−γ FNGW(g_i, g_j)). |
| Researcher Affiliation | Academia | Junjie Yang, LTCI, Télécom Paris, IP Paris, France; Matthieu Labeau, LTCI, Télécom Paris, IP Paris, France; Florence d'Alché-Buc, LTCI, Télécom Paris, IP Paris, France |
| Pseudocode | Yes | Algorithm 1 Computation of the FNGW Distance by CGD Algorithm 2 Computation of FNGW Barycenter with BCD Algorithm 3 Line-Search in Conditional Gradient Descent Algorithm 4 Unmixing Problem Solver Algorithm 5 Stochastic Graph Dictionary Learning |
| Open Source Code | Yes | A Python implementation of the algorithms presented for these tasks can be found on GitHub: https://github.com/chunchiehy/fngw |
| Open Datasets | Yes | We represent input graphs to be classified by kernels based on the FNGW distance; hence, we choose to use a variety of datasets where both node and edge features are present: Cuneiform (Kriege et al., 2018), MUTAG (Debnath et al., 1991; Kriege & Mutzel, 2012), PTC-MR (Helma et al., 2001; Kriege & Mutzel, 2012), BZR-MD, COX2-MD, DHFR-MD and ER-MD (Sutherland et al., 2003; Kriege & Mutzel, 2012). All these datasets were collected by Kersting et al. (2016). Our classifier is a Support Vector Machine (SVM) for which the kernel matrix K is computed with K_{i,j} = exp(−γ FNGW(g_i, g_j)). All the datasets can be downloaded from https://ls11-www.cs.tu-dortmund.de/staff/morris/graphkerneldatasets. To conduct our experiment, we have chosen the widely used QM9 molecule dataset (Ruddigkeit et al., 2012; Ramakrishnan et al., 2014). As in Brouard et al. (2016a); Brogat-Motte et al. (2022), we use the Metabolite Identification dataset processed by Dührkop et al. (2015). ... The dataset can be downloaded from https://zenodo.org/record/804241#.Yi9bzS_pNhE, which is released under the Creative Commons Attribution 4.0 International license. |
| Dataset Splits | Yes | We build five training-test splits using different seeds. Each split has 131,885 training samples and 2,000 test samples. The dataset is split into a training set of size n = 3000 and a test set of size ntest = 1022. More details about the dataset can be found in Appendix C.3. For all methods, we conduct nested cross-validation with 50 iterations of the outer CV loop and 10-fold inner CV. In each outer CV loop, the dataset is split into a training set and a test set, with 10% of the data held out for the latter. The splitting is kept consistent across all methods for a fair comparison. We use a separate validation set (1/5 of the training set) to select the hyperparameters α and β of FNGW loss. |
| Hardware Specification | Yes | The experiments are conducted on an Intel(R) Xeon(R) CPU E5-2650 v2. |
| Software Dependencies | No | A Python implementation of the algorithms presented for these tasks can be found on GitHub. We utilize the NetworkX implementation of GED. The molecules are kekulized by the chemical software RDKit. |
| Experiment Setup | Yes | For the classifiers based on FNGW or FGW, the coefficient γ presented in the Gram matrix computation is cross-validated within {2^i \| i ∈ ⟦−10, 10⟧}, while the number of iterations K in WL labeling is searched in ⟦0, 4⟧ for MUTAG and PTC-MR or ⟦0, 3⟧ for the other datasets (0 means no WL labeling is used), except Cuneiform. For FNGW, we cross-validate 7 values of the trade-off parameter α via a logspace search in [0, 0.5], and the same strategy is applied to β, while a total of 15 values are drawn from logspaces [0, 0.5] and [0.5, 1] to cross-validate α for FGW. For the kernel-based methods, the decay factor λ of RWK is cross-validated within {10^i \| i ∈ ⟦−6, 2⟧}, the number of iterations of WLK is cross-validated in ⟦1, 10⟧, while that of PROPAK is chosen from {1, 3, 5, 8, 10, 15, 20}. For GK, the CV range of the graphlet size is {3, 4, 5}, and that of the precision level ϵ and the confidence δ is {0.1, 0.05}. For NSPDK, the maximum considered radius r between vertices is cross-validated within ⟦0, 5⟧, and the neighborhood depth d is chosen from {3, 4, 5, 6, 8, 10, 12, 14, 20}. Finally, the regularization coefficient C of the SVM is cross-validated within {10^i \| i ∈ ⟦−7, 7⟧} for all the methods, except for the case MUTAG-FNGW where the value 10^7 is not included. Table 7: Hyper-parameter searching ranges during the cross-validation on the Fing2Mol dataset. We choose the ridge regularization parameter (λ = 10^−4) and the diffusion rate (τ = 0.6) following Brogat-Motte et al. (2022). We use a separate validation set (1/5 of the training set) to select the hyperparameters α and β of the FNGW loss. α and β are chosen from {0.1, 0.33, 0.5, 0.67} with the constraint α + β < 1. |
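The quoted excerpts describe building an SVM Gram matrix from pairwise FNGW distances as K_{i,j} = exp(−γ FNGW(g_i, g_j)). A minimal sketch of that step is below; `fngw_gram_matrix` is a hypothetical helper (not from the authors' repository), and the `fngw_distance` callable stands in for whatever pairwise-distance routine is actually used.

```python
import numpy as np

def fngw_gram_matrix(graphs, fngw_distance, gamma=1.0):
    """Build an SVM kernel matrix with K[i, j] = exp(-gamma * d(g_i, g_j)).

    `fngw_distance` is assumed to be a symmetric pairwise distance
    (e.g. an FNGW implementation); this helper is illustrative only.
    """
    n = len(graphs)
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            d = fngw_distance(graphs[i], graphs[j])
            dist[i, j] = dist[j, i] = d  # symmetry: compute each pair once
    return np.exp(-gamma * dist)
```

The resulting matrix can be fed to scikit-learn's `SVC(kernel="precomputed")`, with γ and C selected by cross-validation as described in the Experiment Setup row.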
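The hyperparameter ranges quoted above use power-of-two and power-of-ten grids over integer intervals. A short sketch of how such grids can be materialized (variable names here are illustrative, not taken from the authors' code):

```python
# Cross-validation grids as described in the Experiment Setup excerpt.
# {2^i | i in [-10, 10]} for the kernel coefficient gamma:
gamma_grid = [2.0 ** i for i in range(-10, 11)]

# {10^i | i in [-6, 2]} for the RWK decay factor lambda:
lambda_grid = [10.0 ** i for i in range(-6, 3)]

# {10^i | i in [-7, 7]} for the SVM regularization C:
C_grid = [10.0 ** i for i in range(-7, 8)]
```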