Hidden No More: Attacking and Defending Private Third-Party LLM Inference

Authors: Rahul Krishna Thomas, Louai Zahran, Erica Choi, Akilesh Potti, Micah Goldblum, Arka Pal

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct our experiments on two state-of-the-art open-source LLMs, Gemma-2-2B-IT (Team et al., 2024) and Llama-3.1-8B-Instruct (Grattafiori et al., 2024). We test on samples from the FineWeb-Edu dataset (Penedo et al., 2024). We evaluate on 1000 held-out prompts, and our results are shown in Table 1.
Researcher Affiliation | Collaboration | ¹Ritual AI, ²Stanford University, ³Columbia University. Correspondence to: Arka Pal (Project Lead) <EMAIL>.
Pseudocode | Yes | Algorithm 1: Vocabulary-Matching Attack; Algorithm 2: Cascade Single Layer Forward Pass; Algorithm 3: Generalized Vocabulary-Matching Attack; Algorithm 4: Attack on Sequence Dimension Permuted LLM Hidden States; Algorithm 5: Attack on Hidden Dimension Permuted LLM Hidden States; Algorithm 6: Attack on Factorized-2D Permuted LLM Hidden States; Algorithm 7: Comp Node_i Single Layer Pre-Pass; Algorithm 8: Attn Node_{j,k} Single Layer Attention-Pass; Algorithm 9: Comp Node_i Single Layer Post-Pass
Open Source Code | Yes | Our implementation is available at https://github.com/ritual-net/vma-external.
Open Datasets | Yes | We conduct our experiments on two state-of-the-art open-source LLMs, Gemma-2-2B-IT (Team et al., 2024) and Llama-3.1-8B-Instruct (Grattafiori et al., 2024). We test on samples from the FineWeb-Edu dataset (Penedo et al., 2024).
Dataset Splits | Yes | For each layer of interest, we tune ϵ by performing a ternary search on a small training set of 50 prompts from FineWeb, to determine the optimal L1 threshold under which predicted tokens are accepted as matches. We evaluate on 1000 held-out prompts, and our results are shown in Table 1.
Hardware Specification | Yes | We run our experiments on Paperspace machines with 16 vCPUs and 64 GB RAM; the CPU model is Intel Xeon Gold 6226R @ 2.90 GHz. All machines are colocated in the same region, with an average bandwidth of 2 Gbps and latency of 0.38 ms.
Software Dependencies | No | We benchmark against two recent SMPC schemes for LLM inference, MPCFormer (Li et al., 2023a) and Puma (Dong et al., 2023b). For MPCFormer, we modify the CrypTen implementation to use public rather than private weights, to match our open-weights setting. Puma data is taken from Dong et al. (2023b), as it is built on SPU with its own set of optimizations. The paper also mentions the bitsandbytes library (BitsAndBytes, 2025) and Ray (Moritz et al., 2018); however, no specific version numbers are provided for these software dependencies.
Experiment Setup | Yes | For each layer of interest, we tune ϵ by performing a ternary search on a small training set of 50 prompts from FineWeb, to determine the optimal L1 threshold under which predicted tokens are accepted as matches. We evaluate on 1000 held-out prompts, and our results are shown in Table 1. Due to computational constraints, each evaluation prompt was truncated to a maximum of 50 tokens; however, small-scale experiments with prompts over 200 tokens demonstrated that our results generalize to longer prompt settings: vocab-matching still perfectly decodes hidden states into their input tokens.
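The ϵ-tuning procedure described in the Dataset Splits and Experiment Setup rows (a ternary search for the L1-distance threshold under which a predicted token is accepted as a match) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the search bounds, and the assumption that match accuracy is unimodal in ϵ are all ours.

```python
def l1_distance(a, b):
    """Sum of absolute coordinate differences between two hidden-state vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def accept_match(observed, recomputed, eps):
    """Accept a candidate token when the re-computed hidden state lies
    within L1 distance eps of the observed hidden state."""
    return l1_distance(observed, recomputed) < eps


def tune_epsilon(score_fn, lo, hi, iters=60):
    """Ternary search for the eps maximizing score_fn on [lo, hi].

    score_fn would be token-match accuracy over the small tuning set
    (e.g. the 50 FineWeb prompts); ternary search assumes it is
    unimodal in eps.
    """
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if score_fn(m1) < score_fn(m2):
            lo = m1
        else:
            hi = m2
    return 0.5 * (lo + hi)
```

For example, `tune_epsilon(lambda e: -(e - 0.3) ** 2, 0.0, 1.0)` converges to roughly 0.3 on that toy unimodal score; in the attack setting, the returned threshold would then be passed to `accept_match` on the held-out prompts.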