Online Fraud Detection via Test-Time Retrieval-Based Representation Enrichment

Authors: Yiran Qiao, Ningtao Wang, Yuncong Gao, Yang Yang, Xing Fu, Weiqiang Wang, Xiang Ao

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on three large-scale real-world datasets demonstrate the superiority of TRE. By consistently incorporating information from the nearest neighbors, TRE shows high adaptability and surpasses existing methods in performance.
Researcher Affiliation | Collaboration | 1. Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, CAS, Beijing 100190, China; 2. Key Lab of AI Safety, Chinese Academy of Sciences, Beijing 100094, China; 3. University of Chinese Academy of Sciences, CAS, Beijing 100049, China; 4. Independent Researcher. Xiang Ao is also at CASMINO Ltd., Suzhou 215000, China.
Pseudocode | No | The paper describes steps and processes in paragraph text and uses diagrams (e.g., Figure 2) but does not include any clearly labeled 'Pseudocode' or 'Algorithm' block.
Open Source Code | No | The paper mentions using a third-party open-source library: "For implementation, we apply the offline deployment of Facebook AI Similarity Search (FAISS) (Johnson, Douze, and Jégou 2019), an open-source library designed for efficient similarity search and clustering of dense embeddings." It does not provide access to the source code for the methodology described in the paper.
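The retrieval step that FAISS performs for TRE is nearest-neighbor search over dense embeddings. As a minimal, hedged sketch of that operation (brute-force NumPy stand-in for a FAISS index; the function and variable names here are illustrative, not from the paper):

```python
import numpy as np

def retrieve_neighbors(query, index_embeddings, k=5):
    """Return indices of the k stored embeddings nearest to `query` (L2 distance).

    Brute-force stand-in for the FAISS index the paper deploys offline;
    names are illustrative only.
    """
    dists = np.linalg.norm(index_embeddings - query, axis=1)  # distance to each stored vector
    return np.argsort(dists)[:k]                              # indices of the k smallest

# Toy usage: a bank of 100 stored 8-dim embeddings and one query vector.
rng = np.random.default_rng(0)
bank = rng.normal(size=(100, 8))
q = rng.normal(size=8)
nearest = retrieve_neighbors(q, bank, k=5)
```

In production, a FAISS index replaces the linear scan so the same lookup scales to millions of embeddings.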
Open Datasets | No | We collected three industrial datasets from a mobile payment platform on the premise of complying with security and privacy policies, covering fraud detection, account takeover (ATO) detection, and money laundering detection tasks.
Dataset Splits | Yes | The training data spans from 09/01/2022 to 09/30/2022, while the testing data spans from 02/01/2023 to 02/28/2023. We also selected data from six months after the training period as the out-of-time (OOT) testing dataset. We validate our training set by extracting 20% random samples. Since the ATO and money laundering datasets are highly imbalanced, we conducted a 1:10 undersampling on the negative examples during training.
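The split protocol above combines 1:10 negative undersampling with a 20% random validation holdout. A minimal sketch of that combination (toy data and variable names are illustrative, not the authors' code):

```python
import random

random.seed(0)
labels = [1] * 50 + [0] * 5000  # toy, highly imbalanced pool; 1 = fraud, 0 = benign

positives = [i for i, y in enumerate(labels) if y == 1]
negatives = [i for i, y in enumerate(labels) if y == 0]

# 1:10 undersampling: keep ten negatives per positive example.
kept_negatives = random.sample(negatives, 10 * len(positives))
train_pool = positives + kept_negatives

# Hold out 20% random samples of the (undersampled) pool for validation.
random.shuffle(train_pool)
n_val = len(train_pool) // 5
val_idx, train_idx = train_pool[:n_val], train_pool[n_val:]
```

Note the temporal gap between training (09/2022) and testing (02/2023) windows is handled upstream of this step, by partitioning on timestamps rather than at random.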
Hardware Specification | Yes | All results were derived using a V100-16GB GPU, with Epoch Time measured in GPU seconds.
Software Dependencies | No | Our TRE model is implemented using PyTorch (Paszke et al. 2019) in the Linux environment.
Experiment Setup | Yes | Early stopping is applied with a patience of 5 epochs. The embedding size and the batch size are set to 128 and 512, respectively. We apply dropout of 0.1 at each layer, and the total parameter count of the Retriever and the Predictor is 50.1K, about 1% of a Transformer encoder. AdamW (Loshchilov and Hutter 2018) is used as the optimizer with an initial learning rate of 1e-4.
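The patience-5 early stopping in the setup above can be sketched as follows (an illustrative helper, not the authors' code; the toy metric history is invented):

```python
class EarlyStopper:
    """Stop training when the validation metric fails to improve
    for `patience` consecutive epochs (patience=5 in the setup above)."""

    def __init__(self, patience=5):
        self.patience = patience
        self.best = float("-inf")
        self.bad_epochs = 0

    def step(self, metric):
        """Record one epoch's validation metric; return True when training should stop."""
        if metric > self.best:
            self.best = metric
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience

stopper = EarlyStopper(patience=5)
history = [0.70, 0.72, 0.71, 0.71, 0.71, 0.71, 0.71]  # toy validation AUCs
stopped_at = next(i for i, m in enumerate(history) if stopper.step(m))  # epoch index 6
```

Here the metric peaks at epoch 1 (0.72), so five non-improving epochs later (epoch 6) the stopper fires.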