reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Suitable is the Best: Task-Oriented Knowledge Fusion in Vulnerability Detection

Authors: Jingjing Wang, Minhuan Huang, Yuanping Nie, Xiang Li, Qianjin Du, Wei Kong, Huan Deng, Xiaohui Kuang

NeurIPS 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Extensive experiments demonstrate that KF-GVD outperforms SOTAs on function-level and statement-level vulnerability detection across various target tasks, with an average increase of 40.9% in precision and 26.1% in recall.
Researcher Affiliation	Academia	Jingjing Wang, Institute of Systems Engineering, Academy of Military Sciences, PLA; Minhuan Huang, Institute of Systems Engineering, Academy of Military Sciences, PLA; Yuanping Nie, Institute of Systems Engineering, Academy of Military Sciences, PLA; Xiang Li, Institute of Systems Engineering, Academy of Military Sciences, PLA; Qianjin Du, Department of Computer Science and Technology, Tsinghua University; Wei Kong, School of Information Science and Engineering, Zhejiang Sci-Tech University; Huan Deng, Institute of Systems Engineering, Academy of Military Sciences, PLA; Xiaohui Kuang, Institute of Systems Engineering, Academy of Military Sciences, PLA
Pseudocode	No	The paper describes the method and model architecture in prose and figures (e.g., Figure 3, Figure 5) but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks.
Open Source Code	Yes	Does the paper provide open access to the data and code, with sufficient instructions to faithfully reproduce the main experimental results, as described in supplemental material? Answer: [Yes] Justification: The dataset has been uploaded to the supplementary materials, and the detail can be found in Appendix C.
Open Datasets	Yes	The source task dataset consists of 80% CWE-119 and CWE-416 type vulnerability information extensively collected from 13 real-world C++ projects from NVD. The remaining 20% is sourced from academic security defects and synthetic data provided by SARD.
Dataset Splits	Yes	Train:Validation:Test 8:1:1
Hardware Specification	Yes	We conducted all experiments on a workstation equipped with a Quadro RTX 6000 GPU.
Software Dependencies	Yes	CPGs corresponding to source code files were generated using Joern version 1.1.1033. We employed a pre-trained Word2Vec model... The SAGPool model deployed in both source and target tasks were implemented using PyTorch version 1.4.0 and CUDA version 10.2.
Experiment Setup	Yes	Parameter settings: Min count 0.001; Size 30; Window 5; Embedding dim 300; Hidden dim 32; Activation function ReLU; Learning rate 0.0001; Optimizer Adam; Train:Validation:Test 8:1:1
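
The 8:1:1 Train:Validation:Test ratio reported in the table can be illustrated with a minimal sketch. It assumes a seeded random shuffle before splitting; the summary does not say whether the paper's split was random, stratified, or grouped by project. The function name `split_8_1_1` and the `seed` parameter are hypothetical and do not come from the paper.

```python
import random


def split_8_1_1(samples, seed=0):
    """Shuffle samples and split them into train/validation/test at 8:1:1.

    Assumption: a plain random shuffle. The paper summary gives only the
    ratio, not the splitting procedure.
    """
    items = list(samples)
    random.Random(seed).shuffle(items)  # seeded so the split is repeatable

    # Integer arithmetic avoids floating-point rounding surprises.
    n_train = len(items) * 8 // 10
    n_val = len(items) // 10

    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


if __name__ == "__main__":
    train, val, test = split_8_1_1(range(100))
    print(len(train), len(val), len(test))
```

Fixing the seed keeps the three partitions identical across runs, which matters when comparing models on the same held-out data.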