A Self-Explainable Heterogeneous GNN for Relational Deep Learning
Authors: Francesco Ferrini, Antonio Longa, Andrea Passerini, Manfred Jaeger
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results show that in the context of relational databases, our approach effectively identifies informative meta-paths that faithfully capture the model's reasoning mechanisms. It significantly outperforms existing methods in both synthetic and real-world scenarios. Our experimental evaluation seeks to address the following research questions: Q1 Can MPS-GNN recover the correct meta-path when increasing the setting complexity? Q2 Does MPS-GNN outperform existing approaches in tasks over real-world relational databases? Q3 Is MPS-GNN self-explainable? We compared MPS-GNN with approaches that don't require predefined meta-paths, handle numerous relations, and incorporate node features in learning. The identified competitors include: MLP, to test the sufficiency of target node features alone; GCN (Kipf & Welling, 2016), a baseline non-relational model; RGCN (Schlichtkrull et al., 2017), extending GCN to multi-relational graphs with distinct parameters for each edge type; HGN (Lv et al., 2021a), a heterogeneous GNN model extending GAT to multiple relations; GTN (Yun et al., 2019a), which transforms input graphs into different meta-path graphs on which node representations are learned; Fast-GTN (Yun et al., 2022b), an optimized GTN variant; R-HGNN (Yu et al., 2021), a relation-aware GNN using cross-relation message passing; and MP-GNN (Ferrini et al., 2024), the original meta-path GNN supporting only existentially quantified meta-paths. We implemented our model using PyTorch Geometric, and used the competitors' code from their respective papers for comparison. For training MPS-GNN, we used a 70/20/10 split for training, validation, and testing, respectively, and reported the test results for the model selected based on its validation performance. We employed the F1 score as the evaluation metric to account for the class imbalance in many of the datasets.
The paper includes Table 1 and Table 2, which report F1 results for synthetic and real-world datasets, respectively. |
| Researcher Affiliation | Academia | Francesco Ferrini EMAIL University of Trento, Italy Antonio Longa EMAIL University of Trento, Italy Andrea Passerini EMAIL University of Trento, Italy Manfred Jaeger EMAIL Aalborg University, Denmark |
| Pseudocode | Yes | Algorithm 1 outlines the whole MPS-GNN procedure for the single meta-path case (in practice, a beam search of width K is used and multiple meta-paths are learned). Algorithm 1: MPS-GNN learning procedure, learn-MPS-GNN(G, R, Y, L_MAX, η) |
| Open Source Code | Yes | The code is freely available at https://github.com/francescoferrini/MPS-GNN |
| Open Datasets | Yes | Our approach is particularly useful for predictive tasks in relational databases with multiple tables, where features for a target entity may involve statistics from related tables. To address the second research question, we thus focused on three relational databases with many tables: EICU, a medical database with 31 tables, where we predict patient stay duration in the eICU, modeled as binary node classification by thresholding duration at 20 hours to achieve two balanced classes; MONDIAL, a geographic database where the task is predicting whether a country's religion is Christian; and Ergast F1, containing Formula 1 data, where the task is predicting the winner of a race in a binary classification task where target nodes are represented by a combination of race and pilot. EICU: medical database with 31 tables (node types) from Johnson et al. (2021). URL https://eicu-crd.mit.edu MONDIAL: database containing data from multiple geographical web data sources (May, 1999). URL http://dbis.informatik.uni-goettingen.de/Mondial. Ergast F1: database containing Formula 1 races from the 1950 season to the present day. URL https://relational-data.org/dataset/ErgastF1 Recently, a novel benchmark, rel-bench (Robinson et al., 2024), has been introduced. |
| Dataset Splits | Yes | For training MPS-GNN, we used a 70/20/10 split for training, validation, and testing, respectively, and reported the test results for the model selected based on its validation performance. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments. It only mentions execution times in Table 10 without specifying the hardware on which these times were measured. |
| Software Dependencies | No | We implemented our model using PyTorch Geometric, and used the competitors' code from their respective papers for comparison. The optimizer is omitted from the table as it is Adam for all models. lr denotes the learning rate, wd represents the weight decay, and Patience indicates the early stopping patience (if applicable). The paper mentions software such as PyTorch Geometric and Adam but does not specify version numbers for these or any other key software components. |
| Experiment Setup | Yes | Table 8: Hyperparameters of competitors and MPS-GNN for the real-world datasets. The optimizer is omitted from the table as it is Adam for all models. lr denotes the learning rate, wd represents the weight decay, and Patience indicates the early stopping patience (if applicable). # layers, Embedding dim., lr, wd, # epochs, Patience, and Loss are listed for each model, including MPS-GNN. |
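The evaluation protocol quoted above (a shuffled 70/20/10 train/validation/test split and the F1 score to account for class imbalance) can be sketched as follows. This is an illustrative reconstruction, not the paper's code: only the split ratios and the choice of F1 come from the paper, while the function names, the shuffling seed, and the use of plain Python lists are assumptions.

```python
import random

def split_70_20_10(n, seed=0):
    """Shuffle n node indices and split them 70/20/10 into
    train/val/test. The ratios match the paper's protocol;
    the shuffling strategy and seed are illustrative assumptions."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = int(0.7 * n)
    n_val = int(0.2 * n)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

def f1_score(y_true, y_pred):
    """Binary F1 = harmonic mean of precision and recall, the metric
    the paper reports to handle imbalanced datasets."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Example: 100 target nodes split 70/20/10, then F1 on a toy prediction.
train, val, test = split_70_20_10(100)
print(len(train), len(val), len(test))          # 70 20 10
print(f1_score([1, 1, 0, 0], [1, 0, 0, 1]))     # 0.5
```

In practice a model would be selected on the validation split and its F1 reported on the test split, matching the "model selected based on its validation performance" statement in the Dataset Splits row.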