Provable Membership Inference Privacy
Authors: Zachary Izzo, Jinsung Yoon, Sercan O Arik, James Zou
TMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | In this work, we propose a novel privacy notion, membership inference privacy (MIP)... We give a precise characterization of the relationship between MIP and DP... As a proof of concept, we also provide a simple algorithm for guaranteeing MIP without needing to guarantee DP. Lastly, this paper focused on developing the theoretical principles and guarantees of MIP. Taking advantage of the relaxed requirements of MIP to develop practical algorithms, and systematic empirical evaluation of these algorithms, is an important direction for future work. |
| Researcher Affiliation | Collaboration | Zachary Izzo (NEC Labs America); Jinsung Yoon (Google Cloud AI); Sercan Ö. Arık (Google Cloud AI); James Zou (Department of Biomedical Data Science, Stanford University) |
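| Pseudocode | Yes | Algorithm 1: MIP via noise addition. Require: private dataset $D$, $\sigma$ estimation budget $B$, MIP parameter $\eta$. $D_{\text{train}} \leftarrow \text{RandomSplit}(D, 1/2)$. # Estimate $\sigma$ if an a priori bound is not known: for $i = 1, \dots, B$ do $D^{(i)}_{\text{train}} \leftarrow \text{RandomSplit}(D_{\text{train}}, 1/2)$; $\theta^{(i)} \leftarrow A(D^{(i)}_{\text{train}})$; end for. For $j = 1, \dots, d$ do $\bar{\theta}_j \leftarrow \frac{1}{B} \sum_{i=1}^{B} \theta^{(i)}_j$; $\sigma_j \leftarrow \big( \frac{1}{B} \sum_{i=1}^{B} (\theta^{(i)}_j - \bar{\theta}_j)^M \big)^{1/M}$; end for. # Add appropriate noise to the base algorithm's output: $U \leftarrow \text{Unif}(\{u \in \mathbb{R}^d : \lVert u \rVert_{\sigma,M} = 1\})$; $r \leftarrow \text{Laplace}(6.16\,\ldots)$; return $A(D_{\text{train}}) + X$ |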
| Open Source Code | No | The paper does not contain any explicit statement about releasing source code or provide a link to a code repository. |
| Open Datasets | No | The paper uses a generic dataset 'D' for theoretical discussions and proofs without referring to any specific, publicly available dataset used for empirical evaluation. |
| Dataset Splits | No | The paper describes theoretical concepts and an algorithm, but does not perform experiments on specific datasets, therefore no dataset split information is provided. Algorithm 1 includes '$D_{\text{train}} \leftarrow \text{RandomSplit}(D, 1/2)$' as a theoretical step, not an empirical dataset split. |
| Hardware Specification | No | The paper focuses on theoretical development and does not describe any experimental setup or specific hardware used for computations. |
| Software Dependencies | No | The paper focuses on theoretical development and does not mention any specific software dependencies or their version numbers. |
| Experiment Setup | No | The paper is theoretical in nature, proposing a new privacy notion and an algorithm without detailing concrete experimental setups, hyperparameters, or training configurations for a practical application. Figure 1 shows theoretical noise level comparisons, not empirical experiment results. |
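
The steps of Algorithm 1 quoted in the Pseudocode row can be illustrated with a minimal Python sketch. This is not the authors' implementation, and several pieces are assumptions made only for illustration: the helper names (`random_split`, `weighted_norm`, `estimate_sigma`, `sample_direction`, `mip_noise_addition`) are hypothetical; the norm $\lVert u \rVert_{\sigma,M}$ is assumed to be a coordinate-wise rescaled $\ell_M$ norm; the direction $U$ is drawn by normalizing a scaled Gaussian vector, which is a stand-in and not necessarily uniform on the unit sphere of that norm; the noise is assumed to be $X = rU$; and the Laplace scale is passed in as `laplace_scale` because the quoted pseudocode shows only the constant 6.16, not the full expression in terms of $\eta$.

```python
import numpy as np


def random_split(data, frac, rng):
    """Return a uniformly random subset holding a `frac` fraction of the rows."""
    n_keep = int(len(data) * frac)
    idx = rng.permutation(len(data))[:n_keep]
    return data[idx]


def weighted_norm(u, sigma, M):
    # ASSUMPTION: ||u||_{sigma,M} is taken to be the l_M norm of u / sigma.
    return np.sum(np.abs(u / sigma) ** M) ** (1.0 / M)


def estimate_sigma(d_train, base_alg, B, M, rng):
    """Estimate per-coordinate M-th moment scales from B resampled runs."""
    thetas = np.stack(
        [base_alg(random_split(d_train, 0.5, rng)) for _ in range(B)]
    )
    theta_bar = thetas.mean(axis=0)
    # The absolute value matches the quoted formula when M is even.
    return np.mean(np.abs(thetas - theta_bar) ** M, axis=0) ** (1.0 / M)


def sample_direction(sigma, M, rng):
    # ASSUMPTION: normalizing a scaled Gaussian is a simple stand-in for
    # Unif({u : ||u||_{sigma,M} = 1}); it has unit norm but need not be uniform.
    g = rng.standard_normal(sigma.shape) * sigma
    return g / weighted_norm(g, sigma, M)


def mip_noise_addition(data, base_alg, B, M, laplace_scale, rng):
    """Sketch of Algorithm 1; sigma is assumed strictly positive."""
    d_train = random_split(data, 0.5, rng)
    sigma = estimate_sigma(d_train, base_alg, B, M, rng)
    u = sample_direction(sigma, M, rng)
    # ASSUMPTION: laplace_scale is supplied by the caller; the paper derives
    # it from eta, but that expression is not recoverable from the table.
    r = rng.laplace(loc=0.0, scale=laplace_scale)
    return base_alg(d_train) + r * u  # ASSUMPTION: X = r * U


# Hypothetical usage: the base algorithm is a coordinate-wise mean.
rng = np.random.default_rng(0)
data = rng.normal(size=(200, 3))
released = mip_noise_addition(
    data, lambda D: D.mean(axis=0), B=20, M=2, laplace_scale=0.1, rng=rng
)
```

Keeping the scale estimation, the direction sampling, and the final release in separate functions makes it easy to swap in an a priori bound on $\sigma$ (skipping `estimate_sigma`) or an exact sampler for $U$ without touching the rest of the sketch.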