Cape: Context-Aware Prompt Perturbation Mechanism with Differential Privacy
Authors: Haoqi Wu, Wei Dai, Li Wang, Qiang Yan
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments across multiple datasets, along with ablation studies, demonstrate that Cape achieves a better privacy-utility tradeoff compared to prior state-of-the-art works. |
| Researcher Affiliation | Industry | Haoqi Wu 1, Wei Dai 1, Li Wang 1, Qiang Yan 1. 1 TikTok. Correspondence to: Haoqi Wu <EMAIL>. |
| Pseudocode | Yes | Algorithm 1 Equal-width Bucketing: Bucket. Input: Utility score vector u, the number of buckets Nb. Output: A set of buckets B with different tokens. ... Algorithm 2 Cape Mechanism: R. Input: Prompt x = {t1, t2, ..., tn}, device model M, embedding table e, importance factors λL, λD, bucket number Nb, and budget ϵ > 0. Output: Perturbed prompt x̂ = R(x) |
| Open Source Code | No | The paper does not provide a direct link to a code repository or an explicit statement of code release for the methodology described in this paper. |
| Open Datasets | Yes | For the former, we follow prior works (Yue et al., 2021; Chen et al., 2022) to use two GLUE datasets with privacy implications. 1) SST-2: This contains sentiment annotations for movie reviews...; 2) QNLI: This is a dataset containing sentence pairs for binary classification... For the latter, we follow (Tong et al., 2023) to use Wikitext-103-v1, a large-scale dataset derived from Wikipedia articles for language modeling tasks. |
| Dataset Splits | Yes | For the former, we follow prior works (Yue et al., 2021; Chen et al., 2022) to use two GLUE datasets with privacy implications. 1) SST-2: This contains sentiment annotations for movie reviews, which is used to perform sentiment classification (positive or negative); 2) QNLI: This is a dataset containing sentence pairs for binary classification (entailment/not entailment). We use accuracy as the metric. ... For the latter, we follow (Tong et al., 2023) to use Wikitext-103-v1, a large-scale dataset derived from Wikipedia articles for language modeling tasks. ... on the validation set of SST-2 and QNLI datasets. |
| Hardware Specification | Yes | All the experiments are carried out on one Debian 11 machine equipped with one Intel Xeon Platinum 8260 CPU (6 cores, 2.40 GHz), 16 GB of RAM, and 4 Nvidia Tesla V100-SXM2-32GB GPUs. |
| Software Dependencies | No | The paper mentions 'one Debian 11 machine' as the operating system but does not provide specific version numbers for other software dependencies like programming languages (e.g., Python), libraries (e.g., PyTorch, TensorFlow), or CUDA. |
| Experiment Setup | Yes | By default, we set λL = 0.5, λD = 1.0 and Nb = 50. We run inference on the original data as non-private baseline. ... (default temperature of 0.5). |
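The pseudocode row above names two routines: equal-width bucketing of per-token utility scores into Nb buckets (Algorithm 1) and a perturbation mechanism R that consumes a privacy budget ϵ (Algorithm 2). Since the paper releases no code, the sketch below is only an illustrative reconstruction under stated assumptions: the function names, the dict-based utility vector, and the use of a plain exponential-mechanism sampler for the perturbation step are all hypothetical, not the authors' implementation.

```python
import math
import random
from typing import Dict, List

def equal_width_bucket(utility: Dict[str, float], n_buckets: int) -> List[List[str]]:
    """Partition tokens into n_buckets buckets of equal utility-score width
    (a plausible reading of Algorithm 1's inputs/outputs)."""
    lo, hi = min(utility.values()), max(utility.values())
    width = (hi - lo) / n_buckets or 1.0  # guard the degenerate all-equal case
    buckets: List[List[str]] = [[] for _ in range(n_buckets)]
    for token, score in utility.items():
        # Clamp the maximum score into the last bucket.
        idx = min(int((score - lo) / width), n_buckets - 1)
        buckets[idx].append(token)
    return buckets

def sample_token(candidates: List[str], scores: List[float], epsilon: float) -> str:
    """Exponential-mechanism-style draw of a replacement token — a standard
    DP primitive used here as a stand-in; the paper's exact scoring and
    temperature handling are not reproduced."""
    weights = [math.exp(epsilon * s / 2.0) for s in scores]
    r = random.random() * sum(weights)
    acc = 0.0
    for token, w in zip(candidates, weights):
        acc += w
        if r <= acc:
            return token
    return candidates[-1]
```

For example, `equal_width_bucket({"a": 0.0, "b": 0.5, "c": 1.0}, 2)` splits the score range [0, 1] into two width-0.5 buckets, yielding `[["a"], ["b", "c"]]`; a larger ϵ in `sample_token` concentrates probability mass on higher-scoring candidates.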