Cross-Domain Graph Anomaly Detection via Test-Time Training with Homophily-Guided Self-Supervision
Authors: Delaram Pirhayatifard, Arlei Silva
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments across multiple cross-domain settings demonstrate that GADT3 significantly outperforms existing approaches, achieving average improvements of over 8.2% in AUROC and AUPRC compared to the best competing model. The source code for GADT3 is available at https://github.com/delaramphf/GADT3-Algorithm. We compare our solution against both graph domain adaptation and graph anomaly detection approaches using multiple cross-domain datasets. Our analysis is complemented with ablation studies on key components of the model, such as source training, NSAW, and class-aware regularization. |
| Researcher Affiliation | Academia | Delaram Pirhayatifard, Department of Electrical and Computer Engineering, Rice University; Arlei Silva, Department of Computer Science, Ken Kennedy Institute for Responsible AI and Computing for Global Impact, Rice University |
| Pseudocode | No | The paper describes the methodology using mathematical formulations and descriptive text across sections 3 and 4, but does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | The source code for GADT3 is available at https://github.com/delaramphf/GADT3-Algorithm. |
| Open Datasets | Yes | Datasets. We use six datasets from diverse domains, including online shopping reviews, such as Amazon (AMZ) (McAuley & Leskovec, 2013), Yelp Chi (Rayana & Akoglu, 2015), Yelp Hotel (HTL), and Yelp Res (RES) (Ding et al., 2021), and social networks, including Reddit (RDT) (Kumar et al., 2018) and Facebook (FB) (Leskovec & McAuley, 2012). For our cross-domain analysis, we examine scenarios where feature spaces are homogeneous (same features) and heterogeneous (different features) across domains. To demonstrate the scalability of our model, we additionally report results on three large-scale graphs: Amazon-all (AMZ-all), Yelp Chi-all (YC-all) (Hamilton et al., 2017), and T-Finance (TF) (Tang et al., 2022) in Table 3. |
| Dataset Splits | No | The paper describes using source and target datasets for training and testing/adaptation, respectively. While an early-stopping mechanism is mentioned, implying a validation process, specific training/validation/test splits (e.g., percentages or sample counts) for any individual dataset within the source or target domains are not explicitly provided. |
| Hardware Specification | Yes | All the experiments were conducted on an NVIDIA A40 GPU with 48GB. |
| Software Dependencies | Yes | We developed GADT3 using Python 3.11.9 and PyTorch 2.1.0. |
| Experiment Setup | Yes | The core model is a 2-layer GraphSAGE GNN with weight parameters optimized via the Adam optimizer (Kingma & Ba, 2014) with a dropout rate of 0.7. The source model underwent training for 100 epochs with early stopping employed to determine the target epochs. Both the source and target models were trained with a learning rate of 0.001. We set λ = 0.001, λreg = 0.1, λs = 0.001, p = 40, and α = 20. |
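To make the reported backbone concrete, here is a minimal NumPy sketch of a 2-layer GraphSAGE forward pass with mean aggregation. This is not the authors' GADT3 implementation (which uses PyTorch 2.1.0); the toy graph, feature dimensions, and weight matrices below are illustrative assumptions.

```python
import numpy as np

def sage_layer(H, adj, W):
    """One GraphSAGE layer with mean aggregation:
    h_v' = ReLU(W^T [h_v ; mean_{u in N(v)} h_u])."""
    deg = adj.sum(axis=1, keepdims=True)
    deg[deg == 0] = 1.0                      # guard isolated nodes
    agg = (adj @ H) / deg                    # mean of neighbor features
    concat = np.concatenate([H, agg], axis=1)
    return np.maximum(concat @ W, 0.0)       # ReLU

rng = np.random.default_rng(0)
adj = np.array([[0, 1, 1],
                [1, 0, 0],
                [1, 0, 0]], dtype=float)     # toy 3-node undirected graph
H = rng.standard_normal((3, 4))              # node features, dim 4 (illustrative)
W1 = rng.standard_normal((8, 16)) * 0.1      # layer 1: concat dim 2*4 -> hidden 16
W2 = rng.standard_normal((32, 2)) * 0.1      # layer 2: concat dim 2*16 -> 2 outputs

H1 = sage_layer(H, adj, W1)
H2 = sage_layer(H1, adj, W2)
print(H2.shape)  # (3, 2): one 2-dim output per node
```

In the actual pipeline these weights would be trained with Adam (lr 0.001) and dropout 0.7 as reported in the setup row above.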