TopInG: Topologically Interpretable Graph Learning via Persistent Rationale Filtration

Authors: Cheng Xin, Fan Xu, Xin Ding, Jie Gao, Jiaxin Ding

ICML 2025

Reproducibility Variable Result LLM Response
Research Type | Experimental | Extensive experiments demonstrate TOPING's effectiveness in tackling key challenges, such as handling variform rationale subgraphs, balancing predictive performance with interpretability, and mitigating spurious correlations. Results show that our approach improves upon state-of-the-art methods on both predictive accuracy and interpretation quality.
Researcher Affiliation | Academia | 1) Department of Computer Science, Rutgers University, Piscataway, NJ, USA; 2) School of Computer Science, Shanghai Jiao Tong University, Shanghai, China. Correspondence to: Cheng Xin <EMAIL>, Jiaxin Ding <EMAIL>.
Pseudocode | No | The paper describes methods using prose and mathematical equations but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | No | The paper does not contain an explicit statement about releasing the source code for the methodology described, nor does it provide a link to a code repository.
Open Datasets | Yes | Datasets. We consider eight benchmark datasets commonly used in the graph explainability literature, categorized into three types: Single Motif, Multiple Motif, and Real Dataset. The first two consist of synthetic datasets. Single Motif includes BA-2Motifs (Luo et al., 2020), BA-House Grid (Amara et al., 2023), SPmotif0.5 and SPmotif0.9 (Wu et al., 2022). These datasets contain graphs with a single type of motif or structural pattern repeated throughout. Multiple Motif includes BA-House And Grid, BA-House Or Grid (Bui et al., 2024), and BA-House Or Grid Rnd. The last one is a synthetic dataset we create for verifying the variform rationale challenge for existing intrinsic methods (see Appendix E for more details). Real Dataset includes Mutag (Luo et al., 2020) and Benzene (Sanchez-Lengeling et al., 2020).
Dataset Splits | Yes | Data Splits. For BA synthetic datasets, we follow the previous work (Miao et al., 2022; Chen et al., 2024; Bui et al., 2024) to split them into three sets (80%/10%/10%). For SPmotifs and real datasets, we use the default splits.
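The 80%/10%/10% split reported for the BA synthetic datasets can be sketched as a simple shuffled index split; the function name, seed, and ratios argument below are illustrative, not taken from the authors' code:

```python
import random

def split_dataset(indices, seed=0, ratios=(0.8, 0.1, 0.1)):
    """Shuffle dataset indices and cut them into train/val/test.

    A minimal sketch of an 80%/10%/10% split; real pipelines would
    shuffle with the per-run random seed and index into the graph
    dataset object instead of raw integers.
    """
    rng = random.Random(seed)
    idx = list(indices)
    rng.shuffle(idx)
    n = len(idx)
    n_train = int(ratios[0] * n)
    n_val = int(ratios[1] * n)
    train = idx[:n_train]
    val = idx[n_train:n_train + n_val]
    test = idx[n_train + n_val:]
    return train, val, test

train, val, test = split_dataset(range(1000))
print(len(train), len(val), len(test))  # 800 100 100
```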
Hardware Specification | Yes | All experiments were conducted on a single RTX 4090 GPU.
Software Dependencies | No | The paper mentions using the codebase from (Zhang et al., 2022) for computations, but does not provide specific version numbers for software dependencies such as Python, PyTorch, or CUDA.
Experiment Setup | Yes | Backbone Architecture. We use a two-layer GIN (Xu et al., 2019) with 64 hidden dimensions and 0.3 dropout ratio for all baselines. We use a three-layer CINpp (Giusti et al., 2023) with 64 hidden dimensions and 0.15/0.3 dropout ratio for TOPING. ... All the results are averaged over 5 runs tested with different random seeds. ... The λ of the information regularizer is set to 1. As for the topological constraint, we set the coefficient to 0.01 to achieve the best performance, which aligns with Figure 4. In practice, we simply fix w = 0.5, µ1 = 0.25, µ2 = 0.75, and initialize r1 = r2 = 0.25.
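The reported hyperparameters can be collected into a single configuration mapping, which is the kind of artifact that makes a setup like this reproducible at a glance. The key names below are illustrative (they do not come from the authors' code); the values are the ones quoted above:

```python
# Experiment setup as reported in the paper; key names are
# illustrative, not taken from the authors' (unreleased) code.
TOPING_CONFIG = {
    "backbone": "CINpp",     # three-layer CIN++ for TopInG
    "num_layers": 3,
    "hidden_dim": 64,
    "dropout": (0.15, 0.3),  # two dropout ratios reported
    "lambda_info": 1.0,      # information regularizer weight
    "topo_coef": 0.01,       # topological constraint coefficient
    "w": 0.5,
    "mu1": 0.25,
    "mu2": 0.75,
    "r1_init": 0.25,
    "r2_init": 0.25,
    "num_seeds": 5,          # results averaged over 5 random seeds
}

BASELINE_CONFIG = {
    "backbone": "GIN",       # two-layer GIN for all baselines
    "num_layers": 2,
    "hidden_dim": 64,
    "dropout": 0.3,
}
```

Note that the hardware requirement is modest (a single RTX 4090), so given these values plus the missing software versions, a re-implementation attempt would be feasible.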