The Case for Learned Provenance-based System Behavior Baseline
Authors: Yao Zhu, Zhenyuan Li, Yangyang Wei, Shouling Ji
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our evaluation demonstrates the method's accuracy and adaptability in anomaly path mining, significantly advancing the state-of-the-art in handling and analyzing provenance graphs for anomaly detection. Our comprehensive evaluation, conducted on large-scale and open-source datasets, confirms the effectiveness and efficiency of our provenance graph embedding method. The results highlight its accuracy and adaptability in real-time anomaly path mining tasks, demonstrating its potential to significantly enhance anomaly detection capabilities. Section 4. Experiments. |
| Researcher Affiliation | Academia | College of Computer Science and Technology, Zhejiang University, Hangzhou, China; College of Software Technology, Zhejiang University, Ningbo, China. |
| Pseudocode | No | The paper describes methods and processes in text, such as the 'tag-propagation framework consists of four main stages: tag initialization, propagation, removal, and alert triggering', but does not present these as structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Available at https://github.com/AddoZhu/behavior_baseline |
| Open Datasets | Yes | In our experiments, we utilized datasets from the DARPA Transparent Computing (TC) dataset (tra, 2015.2), which contains millions of benign and hundreds of malicious events collected from platforms with diverse background activities, providing provenance-rich data capturing system events and dependencies over time. This dataset includes a series of realistically simulated Advanced Persistent Threats (APT), such as malware execution, privilege escalation, remote exploitation, and data exfiltration. We primarily use the E3-CADETS dataset, constructing a training dataset with 1,042k system events and a testing dataset with 26k events. Additionally, we demonstrate the adaptability of our method on other datasets in Appendix D.3. |
| Dataset Splits | Yes | We primarily use the E3-CADETS dataset, constructing a training dataset with 1,042k system events and a testing dataset with 26k events. |
| Hardware Specification | Yes | We conducted all experiments on an Ubuntu 18.04.6 LTS server with an Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz, 251 GiB of memory, and three NVIDIA GeForce RTX 3090 GPUs. |
| Software Dependencies | No | We utilize the TensorFlow library to construct neural network models, enabling flexible modifications of layer configurations to implement various architectures such as MLP, LSTM, and CNN. This approach facilitates the evaluation of different machine learning models in the provenance graph embedding and anomaly path mining task. |
| Experiment Setup | Yes | Through controlled variable experiments, we ultimately employed the Adam optimizer with a learning rate of 0.001, and 200 training epochs for each model configuration. We evaluated the learned model's performance based on the prediction accuracy of event regular scores. A prediction is considered true if the difference between the predicted regular score and the true score is less than a threshold of 0.2, which is selected because the frequencies of negative samples we constructed are generally below this value. For the anomaly path mining task, we use path-level precision, recall, and F1 score as evaluation metrics. In the comparative experiments, due to the varying detection granularity across methods, we adopt node-level metrics for consistency. ... we use an L1 kernel regularizer with a coefficient of 0.001, the Adam optimizer (learning rate = 0.001), and the Mean Squared Error (MSE) loss function. To study the impacts of batch sizes, we use different batch sizes from 32 to 2048 to train MLP models on the E3-CADETS dataset |
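The quoted setup defines a threshold-based accuracy criterion: a predicted "regular score" counts as correct when it lies within 0.2 of the ground-truth score. A minimal sketch of that metric, assuming scores are plain floats in [0, 1] (the function name `regular_score_accuracy` and the sample values are illustrative, not from the paper):

```python
import numpy as np

def regular_score_accuracy(pred, true, threshold=0.2):
    """Fraction of events whose predicted regular score falls within
    `threshold` of the ground-truth score (hypothetical helper; the
    0.2 default mirrors the threshold quoted in the setup)."""
    pred = np.asarray(pred, dtype=float)
    true = np.asarray(true, dtype=float)
    return float(np.mean(np.abs(pred - true) < threshold))

# Illustrative values: three of the four predictions fall inside the
# 0.2 band, so the accuracy is 0.75.
acc = regular_score_accuracy([0.10, 0.50, 0.90, 0.30],
                             [0.05, 0.45, 0.60, 0.25])
```

Under this criterion the training configuration quoted above (Adam at learning rate 0.001, MSE loss, L1 kernel regularizer of 0.001) would be tuned to maximize this fraction on held-out events.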