Structural Entropy Guided Unsupervised Graph Out-Of-Distribution Detection

Authors: Yue Hou, He Zhu, Ruomei Liu, Yingke Su, Jinxiang Xia, Junran Wu, Ke Xu

AAAI 2025

Reproducibility (Variable: Result — LLM Response)
Research Type: Experimental — "Extensive experiments on real-world datasets validate the effectiveness of SEGO, demonstrating superior performance over state-of-the-art baselines in OOD detection. Specifically, our method achieves the best performance on 9 out of 10 dataset pairs, with an average improvement of 3.7% on OOD detection datasets, significantly surpassing the best competitor by 10.8% on the FreeSolv/ToxCast dataset pair."
Researcher Affiliation: Academia — "Yue Hou1,2, He Zhu1, Ruomei Liu1, Yingke Su2, Jinxiang Xia1, Junran Wu1*, Ke Xu1 — 1State Key Laboratory of Complex & Critical Software Environment, Beihang University; 2Shen Yuan Honors College, Beihang University. EMAIL"
Pseudocode: No — "The paper describes methods and theoretical foundations using mathematical formulas and descriptive text but does not include any explicitly labeled 'Algorithm' or 'Pseudocode' blocks."
Open Source Code: Yes — Code: https://github.com/name-is-what/SEGO
Open Datasets: Yes — "For OOD detection, we employ 10 pairs of datasets from two mainstream graph data benchmarks (i.e., TUDataset (Morris et al. 2020) and OGB (Hu et al. 2020)) following GOOD-D (Liu et al. 2023). We also conduct experiments on anomaly detection settings, where 5 datasets from TUDataset (Morris et al. 2020) are used for evaluation; the samples in the minority class or real anomalous class are viewed as anomalies, while the remaining samples are viewed as normal data."
Dataset Splits: Yes — "For OOD detection, we employ 10 pairs of datasets from two mainstream graph data benchmarks (i.e., TUDataset (Morris et al. 2020) and OGB (Hu et al. 2020)) following GOOD-D (Liu et al. 2023)."
Hardware Specification: No — "The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments. It mentions using GNNs, but no hardware specifications."
Software Dependencies: No — "The paper mentions using GIN (Xu et al. 2019) as the backbone for the GNN encoder, but it does not specify version numbers for GIN or any other software dependencies such as Python, PyTorch, or CUDA."
Experiment Setup: Yes — "In the experimental setup, the height k of the coding tree is consistently set to 5, in alignment with the GNN encoder. Self-adaptiveness strength θ: to analyze the sensitivity of SEGO to θ, we vary its value from 0 to 1; the AUC for different selections of θ is plotted in Fig. 6."
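The θ sensitivity analysis quoted above can be sketched as a simple hyperparameter sweep: evaluate detection AUC for each θ in [0, 1] and keep the best. The `toy_scores` function, its fixed noise values, and the pairwise AUC helper below are illustrative stand-ins, not SEGO's actual scoring pipeline.

```python
# Hypothetical sketch of a theta sweep (theta = self-adaptiveness strength).
# toy_scores and the auc helper are illustrative stand-ins, not SEGO's code.

def auc(labels, scores):
    """AUC via pairwise comparison (adequate for small toy data)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def sweep_theta(score_fn, labels, thetas):
    """Record AUC for each candidate theta, as in the paper's Fig. 6 analysis."""
    return {t: auc(labels, score_fn(t)) for t in thetas}

# Toy scorer: class separation peaks at theta = 0.5, with fixed noise on top.
labels = [1, 1, 0, 0]            # 1 = OOD sample, 0 = in-distribution sample
signal = [1, 1, -1, -1]
noise = [0.6, -0.8, 0.9, -0.5]

def toy_scores(theta):
    sep = 1 - abs(theta - 0.5)   # separation strength as a function of theta
    return [sep * sig + eps for sig, eps in zip(signal, noise)]

results = sweep_theta(toy_scores, labels, [i / 10 for i in range(11)])
best_theta = max(results, key=results.get)
```

In practice the scorer would be the trained detector evaluated on a held-out ID/OOD pair, with AUC plotted against θ as in the paper's Fig. 6.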