Rethinking Contrastive Learning in Graph Anomaly Detection: A Clean-View Perspective

Authors: Di Jin, Jingyi Cao, Xiaobao Wang, Bingdao Feng, Dongxiao He, Longbiao Wang, Jianwu Dang

IJCAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Extensive experiments on five benchmark datasets validate the effectiveness of our approach." (4 Experiments, 4.1 Experimental Setup) "Datasets. We conduct experiments on five benchmark datasets: Cora, CiteSeer, PubMed [Sen et al., 2008], Citation and ACM [Tang et al., 2008]." (4.3 Ablation Study) "To verify the effectiveness of different modules, we conduct three types of ablation studies."
Researcher Affiliation | Academia | 1 College of Intelligence and Computing, Tianjin University, Tianjin, China; 2 Key Laboratory of Artificial Intelligence Application Technology, Qinghai Minzu University, China; 3 Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
Pseudocode | No | The paper describes the methodology in prose and mathematical equations but does not include a distinct pseudocode or algorithm block.
Open Source Code | No | The paper makes no statement about the availability of open-source code and includes no links to code repositories.
Open Datasets | Yes | "Datasets. We conduct experiments on five benchmark datasets: Cora, CiteSeer, PubMed [Sen et al., 2008], Citation and ACM [Tang et al., 2008]."
Dataset Splits | No | The paper describes how anomalies are injected into the datasets but gives no details on training, validation, and test splits for the experimental data.
Hardware Specification | No | The paper does not report hardware details such as GPU models, CPU types, or memory used to run the experiments.
Software Dependencies | No | The paper does not list software dependencies with version numbers, such as programming languages, libraries, or frameworks.
Experiment Setup | Yes | "Parameter Settings. In CVGAD, we employ a one-layer GCN to aggregate information from subgraphs, with both subgraph embeddings and node embeddings mapped to 64-dimensional vectors. The size of the subgraph is set to 4, and the learning rate remains fixed at 0.001. Additionally, we set the value of γ to 0.8. Five iterations are performed on all datasets. Specifically, for Cora, CiteSeer, and PubMed, we perform edge removal every 100 epochs, conducting a total of 500 epochs. For Citation and ACM, we perform 1000 epochs of edge removal. In the refine training phase, we conduct 200 epochs on Cora, CiteSeer, and PubMed, and 400 epochs on Citation and ACM. Besides, we implement 300 rounds of score calculation in all datasets. In addition, we set K to 0.015 for ACM and 0.01 for other datasets."
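For readers attempting reproduction, the reported settings can be collected into a small configuration sketch. Since the authors release no code, every identifier below is hypothetical; the values are simply transcribed from the quoted "Parameter Settings" paragraph, and fields the paper does not state (e.g. the edge-removal interval for Citation and ACM) are left as None:

```python
# Hypothetical transcription of CVGAD's reported hyperparameters.
# No official code exists, so all names here are assumptions; only the
# numeric values come from the paper's "Parameter Settings" paragraph.

EMBED_DIM = 64        # subgraph and node embeddings are 64-dimensional
SUBGRAPH_SIZE = 4     # size of each sampled subgraph
LEARNING_RATE = 0.001
GAMMA = 0.8           # the paper's gamma
NUM_ITERATIONS = 5    # "five iterations are performed on all datasets"
SCORE_ROUNDS = 300    # rounds of anomaly-score calculation

# Per-dataset training schedule as reported. The edge-removal interval
# is only stated for Cora/CiteSeer/PubMed (every 100 epochs).
DATASET_CONFIG = {
    "Cora":     {"edge_removal_every": 100,  "total_epochs": 500,  "refine_epochs": 200, "K": 0.01},
    "CiteSeer": {"edge_removal_every": 100,  "total_epochs": 500,  "refine_epochs": 200, "K": 0.01},
    "PubMed":   {"edge_removal_every": 100,  "total_epochs": 500,  "refine_epochs": 200, "K": 0.01},
    "Citation": {"edge_removal_every": None, "total_epochs": 1000, "refine_epochs": 400, "K": 0.01},
    "ACM":      {"edge_removal_every": None, "total_epochs": 1000, "refine_epochs": 400, "K": 0.015},
}
```

This is only a value sheet, not an implementation; any reproduction would still have to reconstruct the GCN encoder, subgraph sampler, and contrastive objective from the paper's equations.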