Attention-based Conditional Random Field for Financial Fraud Detection

Authors: Xiaoguang Wang, Chenxu Wang, Luyue Zhang, Xiaole Wang, Mengqin Wang, Huanlong Liu, Tao Qin

IJCAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments show that ACRF-RNN outperforms the state-of-the-art methods by 15.28% in KS and 4.04% in Recall_m. Data and code are available at: https://github.com/XNetLab/ACRF-RNN.git.
Researcher Affiliation | Academia | Xiaoguang Wang^1, Chenxu Wang^1,2,3,4, Luyue Zhang^1, Xiaole Wang^1, Mengqin Wang^1, Huanlong Liu^1 and Tao Qin^2. ^1School of Software Engineering, Xi'an Jiaotong University; ^2MoE Key Lab of Intelligent Networks and Network Security (INNS), Xi'an Jiaotong University; ^3Interdisciplinary Research Center of Frontier Science and Technology, Xi'an Jiaotong University; ^4Shaanxi Joint (Key) Laboratory for Artificial Intelligence, Xi'an Jiaotong University. EMAIL, EMAIL, EMAIL
Pseudocode | No | The paper describes the methodology using mathematical equations and textual explanations, but it does not include any clearly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | Data and code are available at: https://github.com/XNetLab/ACRF-RNN.git.
Open Datasets | Yes | This work presents a real-world dataset to evaluate the performance of ACRF-RNN. Extensive experiments show that ACRF-RNN outperforms the state-of-the-art methods by 15.28% in KS and 4.04% in Recall_m. Data and code are available at: https://github.com/XNetLab/ACRF-RNN.git.
Dataset Splits | Yes | In real business scenarios, financial fraud detection is conducted once a year, using models trained on historical data to detect fraudulent companies for the current year. Therefore, we split the real-world dataset by year. Specifically, we use the data from 2010 to 2016 for training and use the data from 2017 as validation to adjust hyperparameters. Next, with fixed hyperparameters, we train the model on data from 2010 to T, and test it on data at the (T+1)-th year, where T ∈ {2017, 2018, 2019}. Table 1 details the partitions.
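The rolling year-based split described above can be sketched as follows. This is an illustrative reconstruction, not code from the released repository; the record layout (a dict with a `year` field) and function names are assumptions.

```python
# Sketch of the rolling year-based evaluation protocol described above:
# train on years first_train_year..T, test on year T+1, for T in {2017, 2018, 2019}.
# Hyperparameters are tuned once beforehand (train 2010-2016, validate on 2017).

def temporal_splits(records, first_train_year=2010, test_years=(2018, 2019, 2020)):
    """Yield (train, test) partitions: train on first_train_year..T, test on T+1."""
    for test_year in test_years:
        train = [r for r in records if first_train_year <= r["year"] < test_year]
        test = [r for r in records if r["year"] == test_year]
        yield train, test
```

Splitting by calendar year rather than randomly mirrors the deployment setting: the model never sees data from the year it is evaluated on.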
Hardware Specification | No | The paper states: "all the experiments are run on a Ubuntu 16.04 LTS server." This is too general and does not provide specific hardware details like GPU/CPU models or memory.
Software Dependencies | Yes | We implement ACRF-RNN based on PyTorch 1.12.1 with Python 3.8, and all the experiments are run on a Ubuntu 16.04 LTS server.
Experiment Setup | Yes | Grid search is employed to select the optimal hyper-parameters based on the validation set. All parameters are initialized using the Kaiming initialization and are trained using the Adam optimizer with an initial learning rate of 0.01. The optimal time window size w is 5, the number of CRF iterations K is set to 5, and the sub-sequence embedding dimension d is set to 64. The attention layer consists of M = 3 attention heads and its hidden size is set to 32. The penalty coefficient of fraud samples α_balance is set to 0.15.
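The quoted setup (Kaiming initialization, Adam with lr = 0.01, and the listed hyperparameter values) can be sketched in PyTorch as below. The model body is a placeholder stand-in, not ACRF-RNN's actual architecture, and the hyperparameter dictionary keys are illustrative names.

```python
import torch
import torch.nn as nn

# Placeholder model standing in for ACRF-RNN; only the init/optimizer
# choices below reflect the reported setup.
model = nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 2))

def kaiming_init(module):
    """Apply Kaiming initialization to linear layers, as reported."""
    if isinstance(module, nn.Linear):
        nn.init.kaiming_uniform_(module.weight, nonlinearity="relu")
        nn.init.zeros_(module.bias)

model.apply(kaiming_init)

# Adam optimizer with the reported initial learning rate of 0.01.
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

# Hyperparameter values as quoted; key names are illustrative.
HPARAMS = {
    "window_size_w": 5,        # time window size
    "crf_iterations_K": 5,     # CRF mean-field iterations
    "embed_dim_d": 64,         # sub-sequence embedding dimension
    "attention_heads_M": 3,    # number of attention heads
    "attention_hidden": 32,    # attention layer hidden size
    "fraud_penalty_coeff": 0.15,  # penalty coefficient for fraud samples
}
```

In practice the grid search mentioned above would loop over candidate values of these keys, re-training on 2010-2016 and scoring on the 2017 validation year.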