ProSec: Fortifying Code LLMs with Proactive Security Alignment
Authors: Xiangzhe Xu, Zian Su, Jinyao Guo, Kaiyuan Zhang, Zhenting Wang, Xiangyu Zhang
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments show that models trained with PROSEC are 25.2% to 35.4% more secure compared to previous work without degrading models' utility. ... We demonstrate the effectiveness of PROSEC on the Purple Llama (Bhatt et al., 2023) secure coding benchmark. The models trained with the dataset synthesized by PROSEC are 25.2%–35.4% more secure than those trained with the SafeCoder dataset. We further validate that PROSEC does not harm the utility of code LLMs. We conduct thorough ablation studies to justify the design decisions in PROSEC. |
| Researcher Affiliation | Academia | 1Department of Computer Science, Purdue University, IN, USA 2Department of Computer Science, Rutgers University, NJ, USA. Correspondence to: Xiangzhe Xu <EMAIL>, Zian Su <EMAIL>. |
| Pseudocode | Yes | Algorithm 1 Vulnerability-inducing instruction generation |
| Open Source Code | Yes | We publish a dataset of synthesized vulnerability-inducing instructions that can effectively expose the weakness of code LLMs. PROSEC and the dataset are available at https://github.com/PurCL/ProSec. |
| Open Datasets | Yes | Seed Instruction-Tuning Dataset: We use the code-related part of Infinity-Instruct (BAAI, 2024) as our seed instruction dataset for data synthesis. ... Test Dataset: We use Purple Llama (Bhatt et al., 2023) as the test dataset for code model safety. ... We use the multi-lingual version of HumanEval (Chen et al., 2021; Guo et al., 2024a) and the multi-lingual version of MBPP (Austin et al., 2021) (denoted as MXEval (Athiwaratkun et al., 2022)) as the test dataset for utility. ... We publish a dataset of synthesized vulnerability-inducing instructions that can effectively expose the weakness of code LLMs. PROSEC and the dataset are available at https://github.com/PurCL/ProSec. |
| Dataset Splits | No | The paper describes how the preference dataset is constructed and selected, but it does not specify explicit training, validation, or test splits for this dataset that would be needed to reproduce their experimental results. It mentions using subsets of existing benchmarks for evaluation but not the specific splits of their own generated data. |
| Hardware Specification | Yes | We run the training of PROSEC on 2 NVIDIA A100-40G GPUs. |
| Software Dependencies | No | The paper mentions several preference optimization methods (DPO, IPO, ORPO, SimPO) and LoRA for parameter-efficient training, along with their hyperparameters. However, it does not provide specific version numbers for these libraries or other underlying software such as Python, PyTorch, or CUDA. |
| Experiment Setup | Yes | The major preference optimization-related hyperparameters in our experiments are shown in Table 6. For training, we set the total batch size to 64. We adopt LoRA (Hu et al., 2021) for parameter-efficient training of the target model. The rank r = 8 and α = 16 for all our experiments. We run the training of PROSEC on 2 NVIDIA A100-40G GPUs. ... Warm-up Training for Influence Score: We train each target model on D_sec for 1k steps and leverage checkpoints of every 100 steps to compute the training dynamics for D_norm data influence score computation. |
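Since the paper reports using DPO-style preference optimization (among DPO, IPA, ORPO, and SimPO, per the Software Dependencies row) but does not specify library versions, the rows above leave the exact training code unreproducible. As a hedged illustration only, the sketch below shows the standard per-example DPO objective in pure Python; the β value and log-probability inputs are placeholders, not values taken from the paper (its β appears in Table 6, which is not quoted here).

```python
import math

def dpo_loss(policy_chosen_logp: float, policy_rejected_logp: float,
             ref_chosen_logp: float, ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """Per-example DPO loss: -log sigmoid(beta * (policy margin - reference margin)).

    Each argument is the sequence log-probability of the chosen/rejected
    response under the trained policy or the frozen reference model.
    beta=0.1 is a common default here, not a value from the paper.
    """
    logits = beta * ((policy_chosen_logp - ref_chosen_logp)
                     - (policy_rejected_logp - ref_rejected_logp))
    # -log(sigmoid(x)) == log(1 + exp(-x)); log1p keeps this numerically stable.
    return math.log1p(math.exp(-logits))

# When policy and reference agree (zero margin difference), the loss is log 2.
baseline = dpo_loss(-1.0, -2.0, -1.0, -2.0)
# When the policy strongly prefers the chosen response relative to the
# reference, the loss approaches zero.
confident = dpo_loss(0.0, -100.0, 0.0, 0.0)
```

Note that actual training would compute these log-probabilities with a framework such as PyTorch and backpropagate through the policy terms; this fragment only makes the objective in the quoted setup concrete.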