Global Attribute-Association Pattern Aggregation for Graph Fraud Detection
Authors: Mingjiang Duan, Da He, Tongya Zheng, Lingxiang Jia, Mingli Song, Xinyu Wang, Zunlei Feng
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments comparing our approach with 24 methods on 7 datasets demonstrate that the proposed method achieves SOTA performance. |
| Researcher Affiliation | Collaboration | 1State Key Laboratory of Blockchain and Data Security, Zhejiang University 2Bangsheng Technology Co., Ltd. 3Big Graph Center, Hangzhou City University 4Hangzhou High-Tech Zone (Binjiang) Institute of Blockchain and Data Security. Authors Mingjiang Duan and Xinyu Wang are affiliated with both Zhejiang University (academic) and Bangsheng Technology Co., Ltd. (industry), indicating a collaboration. |
| Pseudocode | No | The paper describes the methodology using textual descriptions and mathematical equations but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code: https://github.com/AtwoodDuan/GAAP |
| Open Datasets | Yes | Experiments were conducted on seven real-world fraud detection datasets, as follows: YelpChi, a dataset aimed at identifying abnormal reviews that unfairly promote or demote products or businesses on Yelp.com (Rayana and Akoglu 2015). Amazon, a dataset containing users who write fake reviews in the musical instruments category on Amazon.com (McAuley and Leskovec 2013). T-Finance, a financial transaction fraud dataset (Tang et al. 2022). T-Social, a dataset for detecting abnormal accounts in social networks (Tang et al. 2022). Elliptic, designed for illicit Bitcoin transaction detection (Weber et al. 2019). Tolokers, a dataset for detecting fraudulent users on the Toloka crowd-sourcing platform (Platonov et al. 2023). DGraph-Fin, a credit default detection dataset provided by the Finvolution Group, constructed using guarantor contact information (Huang et al. 2022b). |
| Dataset Splits | No | The paper states that 'Early stopping was performed on the validation set, and the scores on the test set were reported,' implying train/validation/test splits, but it does not provide specific percentages, sample counts, or a detailed splitting methodology in the main text; it defers to GADBench, an external reference, for details. |
| Hardware Specification | Yes | All experiments were run on an Intel(R) Xeon(R) Gold 5318Y CPU @ 2.10GHz with multiple NVIDIA A6000 GPUs. |
| Software Dependencies | No | The paper states: 'These GNN-related algorithms are mostly implemented using the DGL framework provided by GADBench (Tang et al. 2024).' While DGL is named, no specific version number is provided for it or any other software component. |
| Experiment Setup | Yes | In each trial on every dataset, a set of hyperparameters was randomly selected from a predefined search space for each model. Early stopping was performed on the validation set, and the scores on the test set were reported. For our proposed method, the number of bins T was selected from a range of 4 to 40, the number of GNN layers L ranged from 1 to 4, and the mini-batch size varied from 32 to 5000 depending on the dataset. |
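The trial protocol described in the Experiment Setup row (randomly sample a configuration from a predefined search space, train with early stopping on the validation score, report the test score) can be sketched as follows. This is a minimal illustration, not the authors' code: the search-space ranges mirror those reported in the table, but the parameter names, the `patience` value, and the batch-size choices are assumptions for the example.

```python
import random

# Hypothetical search space mirroring the reported ranges:
# bins T in [4, 40], GNN layers L in [1, 4]; the batch sizes
# listed here are illustrative points within the 32-5000 range.
SEARCH_SPACE = {
    "num_bins": list(range(4, 41)),
    "num_layers": [1, 2, 3, 4],
    "batch_size": [32, 128, 512, 1024, 5000],
}

def sample_hyperparameters(space, seed=None):
    """Randomly draw one configuration from the search space."""
    rng = random.Random(seed)
    return {name: rng.choice(values) for name, values in space.items()}

def run_trial(train_fn, space, patience=10, max_epochs=200, seed=0):
    """One trial: sample a config, train with early stopping on the
    validation score, and return the test score at the best epoch.

    `train_fn(config, epoch)` is a user-supplied step returning
    (validation_score, test_score) after training one more epoch.
    """
    config = sample_hyperparameters(space, seed=seed)
    best_val, best_test, bad_epochs = float("-inf"), None, 0
    for epoch in range(max_epochs):
        val_score, test_score = train_fn(config, epoch)
        if val_score > best_val:
            best_val, best_test, bad_epochs = val_score, test_score, 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                break  # early stopping on the validation set
    return config, best_test
```

In a benchmark harness such as GADBench this loop would be repeated once per trial and per dataset, with a fresh random configuration drawn each time.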