Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Backdoor Attack on Propagation-based Rumor Detectors

Authors: Di Jin, Yujun Zhang, Bingdao Feng, Xiaobao Wang, Dongxiao He, Zhen Wang

AAAI 2025 | Venue PDF | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental Extensive experiments on three real-world rumor datasets demonstrate that our framework effectively undermines the performance of propagation-based rumor detectors and is transferable across different architectures. In this section, we conduct empirical studies to answer the following research questions:
RQ1: Is IBAttack effective/evasive on RPTs?
RQ2: Is IBAttack effective and transferable across different rumor detectors?
RQ3: What is the impact of node importance measures on IBAttack?
RQ4: What is the impact of trigger size on IBAttack?
RQ5: What is the impact of poisoning rate on IBAttack?
Experimental Settings. Datasets. We conduct experiments on three real-world rumor datasets: Twitter16 (Ma, Gao, and Wong 2017), Twitter15 (Ma, Gao, and Wong 2017) and Pheme (Zubiaga, Liakata, and Procter 2017). Table 1 presents detailed statistics.
Researcher Affiliation Academia 1College of Intelligence and Computing, Tianjin University, Tianjin, China 2Guangdong Laboratory of Artificial Intelligence and Digital Economy (SZ), Shenzhen, China 3School of Cybersecurity, Northwestern Polytechnical University, Xi'an, Shaanxi, China
Pseudocode Yes Algorithm 1: IBAttack Input: Graph dataset C, target label yt, attack budget M Output: Backdoored dataset C′, parameters ω of the feature generator
Open Source Code No The paper does not provide any specific links to source code repositories, nor does it contain explicit statements about the release of source code in supplementary materials or upon publication.
Open Datasets Yes We conduct experiments on three real-world rumor datasets: Twitter16 (Ma, Gao, and Wong 2017), Twitter15 (Ma, Gao, and Wong 2017) and Pheme (Zubiaga, Liakata, and Procter 2017).
Dataset Splits Yes The dataset-splitting rules are applied to all methods. Following the settings of BiGCN, we split each dataset into two parts: 80% is used as a training dataset, and 20% is used as a testing dataset. From the non-target classes in the training set, we randomly select 5% to be backdoored.
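The quoted split-and-poison procedure can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name, the graph-ID representation, and the random seed are assumptions; only the 80/20 split, the non-target-class restriction, and the 5% poisoning rate come from the quoted text.

```python
import random

def split_and_select_poison(graphs, labels, target_label,
                            train_frac=0.8, poison_rate=0.05, seed=0):
    """Sketch of the reported setup: 80% train / 20% test, then 5% of
    the non-target-class training graphs are selected for backdooring.
    `graphs` is a list of graph identifiers (illustrative)."""
    rng = random.Random(seed)
    idx = list(range(len(graphs)))
    rng.shuffle(idx)
    cut = int(train_frac * len(idx))
    train_idx, test_idx = idx[:cut], idx[cut:]
    # Poison candidates: training graphs whose label is NOT the target class
    non_target = [i for i in train_idx if labels[i] != target_label]
    n_poison = max(1, int(poison_rate * len(non_target)))
    poison_idx = rng.sample(non_target, n_poison)
    return train_idx, test_idx, poison_idx
```

Restricting poison candidates to non-target classes matters because the attack's goal is to flip those samples to the target label once the trigger is present.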
Hardware Specification No The paper does not mention any specific hardware details such as GPU models, CPU types, or memory specifications used for conducting the experiments.
Software Dependencies No The paper does not provide specific version numbers for any software components, libraries, or frameworks used in the experiments.
Experiment Setup Yes The trigger size is set as 10% and 20% of the number of graph nodes during the training and testing phases, respectively. The training epoch and learning rate are set to 100 and 0.001. All baselines use the optimal parameters from the original papers. More details can be found in the appendix.
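The reported hyperparameters can be captured in a small sketch. The 10%/20% trigger fractions, 100 epochs, and 0.001 learning rate are from the quoted text; the function name, the rounding rule, and the dictionary layout are assumptions for illustration.

```python
def trigger_budget(num_nodes, phase):
    """Trigger size as a fraction of the graph's node count, per the
    reported setup: 10% during training, 20% during testing.
    Rounding to the nearest integer is an assumption."""
    frac = 0.10 if phase == "train" else 0.20
    return max(1, round(frac * num_nodes))

# Reported training hyperparameters (the optimizer is not specified in the excerpt)
CONFIG = {"epochs": 100, "lr": 0.001}
```

For a 50-node propagation graph this yields a 5-node trigger at training time and a 10-node trigger at test time.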