Do Parameters Reveal More than Loss for Membership Inference?

Authors: Anshuman Suri, Xiao Zhang, David Evans

TMLR 2024

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | We empirically demonstrate its effectiveness in simple settings, showing that it outperforms state-of-the-art reference-model-based and prior white-box attacks (Section 4). Our analyses suggest that the improved auditing performance can be directly attributed to access to the model's parameters. ... To evaluate IHA, we efficiently pre-compute L1(w) to facilitate the computation of L0(w) for any given target record z1. |
| Researcher Affiliation | Academia | Anshuman Suri (EMAIL), University of Virginia; Xiao Zhang (EMAIL), CISPA Helmholtz Center for Information Security; David Evans (EMAIL), University of Virginia |
| Pseudocode | No | The paper describes the methodology using mathematical equations and prose, but provides no explicit pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our implementation for reproducing all the experiments is available as open-source code at https://github.com/iamgroot42/auditingmi. |
| Open Datasets | Yes | Purchase-100(S). The task for this dataset (Shokri et al., 2017)... MNIST-Odd. We consider the MNIST dataset (LeCun et al., 1998)... Fashion MNIST. We use the Fashion MNIST (Xiao et al., 2017) dataset... |
| Dataset Splits | No | The paper describes how training data for individual models is sampled ("data from each model is sampled at random from the actual dataset with a 50% probability") and how evaluation is performed (FPR/TPR on members/non-members), but it does not specify explicit training/validation/test splits (e.g., an 80/10/10 split or specific sample counts) for model training. |
| Hardware Specification | No | The paper mentions that 'the Hessian is too large to store on our GPU for IHA and is thus stored on the CPU,' but it does not give specific models or specifications for the GPU or CPU used in the experiments. |
| Software Dependencies | No | The paper implicitly relies on machine learning frameworks and libraries (e.g., through its discussion of SGD), but it does not list any software dependencies with specific version numbers (e.g., Python 3.8, PyTorch 1.9). |
| Experiment Setup | Yes | All of our models are trained with momentum (µ = 0.9) and regularization (α = 5e-4), with a learning rate λ = 0.01. |
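The reported training configuration (momentum µ = 0.9, L2 regularization α = 5e-4, learning rate λ = 0.01) can be sketched as a plain SGD-with-momentum update. PyTorch-style semantics, where the weight-decay term is folded into the gradient before the momentum buffer update, are an assumption here; the paper does not name the framework.

```python
# Sketch of one SGD step with the paper's reported hyperparameters.
# Semantics follow the common PyTorch convention and are an assumption.
MU, ALPHA, LAM = 0.9, 5e-4, 0.01  # momentum, weight decay, learning rate

def sgd_step(w, grad, buf):
    """One SGD-with-momentum step over parameter lists w, grad, buf."""
    new_w, new_buf = [], []
    for wi, gi, bi in zip(w, grad, buf):
        d = gi + ALPHA * wi          # add weight-decay (L2) term to gradient
        b = MU * bi + d              # update momentum buffer
        new_w.append(wi - LAM * b)   # parameter update
        new_buf.append(b)
    return new_w, new_buf

# Toy usage: minimize f(w) = w^2 starting from w = 1.0.
w, buf = [1.0], [0.0]
for _ in range(500):
    grad = [2.0 * w[0]]              # gradient of w^2
    w, buf = sgd_step(w, grad, buf)
```

With these settings the toy iterate contracts toward the minimum at 0, illustrating that the reported hyperparameters describe a standard stable training setup rather than anything attack-specific.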