Global Attribute-Association Pattern Aggregation for Graph Fraud Detection
Authors: Mingjiang Duan, Da He, Tongya Zheng, Lingxiang Jia, Mingli Song, Xinyu Wang, Zunlei Feng
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments comparing our approach with 24 methods on 7 datasets demonstrate that the proposed method achieves SOTA performance. |
| Researcher Affiliation | Collaboration | 1State Key Laboratory of Blockchain and Data Security, Zhejiang University 2Bangsheng Technology Co., Ltd. 3Big Graph Center, Hangzhou City University 4Hangzhou High-Tech Zone (Binjiang) Institute of Blockchain and Data Security. Authors Mingjiang Duan and Xinyu Wang are affiliated with both Zhejiang University (academic) and Bangsheng Technology Co., Ltd. (industry), indicating a collaboration. |
| Pseudocode | No | The paper describes the methodology using textual descriptions and mathematical equations but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code: https://github.com/AtwoodDuan/GAAP |
| Open Datasets | Yes | Experiments were conducted on seven real-world fraud detection datasets, as follows: YelpChi, a dataset aimed at identifying abnormal reviews that unfairly promote or demote products or businesses on Yelp.com (Rayana and Akoglu 2015). Amazon, a dataset containing users who write fake reviews in the musical instruments category on Amazon.com (McAuley and Leskovec 2013). T-Finance, a financial transaction fraud dataset (Tang et al. 2022). T-Social, a dataset for detecting abnormal accounts in social networks (Tang et al. 2022). Elliptic, designed for illicit Bitcoin transaction detection (Weber et al. 2019). Tolokers, a dataset for detecting fraudulent users on the Toloka crowd-sourcing platform (Platonov et al. 2023). DGraph-Fin, a credit default detection dataset provided by the Finvolution Group, constructed using guarantor contact information (Huang et al. 2022b). |
| Dataset Splits | No | The paper states that 'Early stopping was performed on the validation set, and the scores on the test set were reported,' implying train/validation/test splits, but it does not provide specific percentages, sample counts, or a detailed splitting methodology in the main text; it defers to GADBench, an external reference, for details. |
| Hardware Specification | Yes | All experiments were run on an Intel(R) Xeon(R) Gold 5318Y CPU @ 2.10GHz with multiple NVIDIA A6000 GPUs. |
| Software Dependencies | No | The paper states: 'These GNN-related algorithms are mostly implemented using the DGL framework provided by GADBench (Tang et al. 2024).' While DGL is named, no specific version number is provided for it or any other software component. |
| Experiment Setup | Yes | In each trial on every dataset, a set of hyperparameters was randomly selected from a predefined search space for each model. Early stopping was performed on the validation set, and the scores on the test set were reported. For our proposed method, the number of bins T was selected from a range of 4 to 40, the number of GNN layers L ranged from 1 to 4, and the mini-batch size varied from 32 to 5000 depending on the dataset. |
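The trial protocol described in the Experiment Setup row (randomly sample a configuration from a predefined search space, train with early stopping on the validation score, report the test score) can be sketched as follows. This is a minimal illustration, not the authors' code: the search-space ranges mirror those reported in the table, but the parameter names, the `patience` value, and the batch-size choices are assumptions for the example.

```python
import random

# Hypothetical search space mirroring the reported ranges:
# bins T in [4, 40], GNN layers L in [1, 4]; the batch sizes
# listed here are illustrative points within the 32-5000 range.
SEARCH_SPACE = {
    "num_bins": list(range(4, 41)),
    "num_layers": [1, 2, 3, 4],
    "batch_size": [32, 128, 512, 1024, 5000],
}

def sample_hyperparameters(space, seed=None):
    """Randomly draw one configuration from the search space."""
    rng = random.Random(seed)
    return {name: rng.choice(values) for name, values in space.items()}

def run_trial(train_fn, space, patience=10, max_epochs=200, seed=0):
    """One trial: sample a config, train with early stopping on the
    validation score, and return the test score at the best epoch.

    `train_fn(config, epoch)` is a user-supplied step returning
    (validation_score, test_score) after training one more epoch.
    """
    config = sample_hyperparameters(space, seed=seed)
    best_val, best_test, bad_epochs = float("-inf"), None, 0
    for epoch in range(max_epochs):
        val_score, test_score = train_fn(config, epoch)
        if val_score > best_val:
            best_val, best_test, bad_epochs = val_score, test_score, 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                break  # early stopping on the validation set
    return config, best_test
```

In a benchmark harness such as GADBench this loop would be repeated once per trial and per dataset, with a fresh random configuration drawn each time.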