Do Parameters Reveal More than Loss for Membership Inference?
Authors: Anshuman Suri, Xiao Zhang, David Evans
TMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically demonstrate its effectiveness in simple settings, showing that it outperforms state-of-the-art reference-model-based and prior white-box attacks (Section 4). Our analyses suggest that the improved auditing performance can be directly attributed to access to the model's parameters. ... To evaluate IHA, we efficiently pre-compute L1(w) to facilitate the computation of L0(w) for any given target record z1. |
| Researcher Affiliation | Academia | Anshuman Suri EMAIL University of Virginia; Xiao Zhang EMAIL CISPA Helmholtz Center for Information Security; David Evans EMAIL University of Virginia |
| Pseudocode | No | The paper describes the methodology using mathematical equations and prose, but no explicit pseudocode or algorithm blocks are provided. |
| Open Source Code | Yes | Our implementation for reproducing all the experiments is available as open-source code at https://github.com/iamgroot42/auditingmi. |
| Open Datasets | Yes | Purchase-100(S). The task for this dataset (Shokri et al., 2017)... MNIST-Odd. We consider the MNIST dataset (LeCun et al., 1998)... Fashion MNIST. We use the Fashion MNIST (Xiao et al., 2017) dataset... |
| Dataset Splits | No | The paper describes how training data for individual models is sampled ("data from each model is sampled at random from the actual dataset with a 50% probability") and how evaluation is performed (FPR/TPR on members/non-members), but it does not specify explicit training/validation/test splits (e.g., 80/10/10 split or specific sample counts) for the datasets themselves for model training. |
| Hardware Specification | No | The paper mentions that 'the Hessian is too large to store on our GPU for IHA and is thus stored on the CPU,' but it does not provide specific models or specifications for the GPU or CPU used for the experiments. |
| Software Dependencies | No | The paper mentions the use of machine learning frameworks and libraries implicitly through the mathematical notation and discussion of SGD, but it does not explicitly list any software dependencies with specific version numbers (e.g., Python 3.8, PyTorch 1.9). |
| Experiment Setup | Yes | All of our models are trained with momentum (µ = 0.9) and regularization (α = 5e-4), with a learning rate λ = 0.01. |
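The setup described in the table — each record joining a shadow model's training set independently with 50% probability, and SGD with momentum µ = 0.9, weight decay α = 5e-4, and learning rate λ = 0.01 — can be sketched in pure Python. This is a minimal illustration of those hyperparameters, not the paper's actual implementation (see its repository for that); the function names and single-step framing are our own.

```python
import random

def sample_membership(dataset_size, p=0.5, seed=0):
    """Sample a training set: each record is included independently
    with probability p (the paper's 50% membership sampling)."""
    rng = random.Random(seed)
    return [i for i in range(dataset_size) if rng.random() < p]

def sgd_step(w, grad, velocity, lr=0.01, momentum=0.9, weight_decay=5e-4):
    """One SGD update with momentum and L2 regularization, using the
    hyperparameters reported in the paper (lr=0.01, mu=0.9, alpha=5e-4)."""
    new_w, new_v = [], []
    for wi, gi, vi in zip(w, grad, velocity):
        g = gi + weight_decay * wi  # gradient of loss + L2 penalty
        v = momentum * vi + g       # momentum buffer update
        new_w.append(wi - lr * v)   # parameter update
        new_v.append(v)
    return new_w, new_v
```

For example, a shadow-model training set for a 1,000-record dataset would be `sample_membership(1000)`, and each optimization step would thread the returned `velocity` back into the next `sgd_step` call.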