Attribute Inference Attacks for Federated Regression Tasks

Authors: Francesco Diana, Othmane Marfoq, Chuan Xu, Giovanni Neglia, Frédéric Giroire, Eoin Thomas

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We benchmark our proposed attacks against state-of-the-art methods using real-world datasets. The results demonstrate a significant increase in reconstruction accuracy, particularly in heterogeneous client datasets, a common scenario in FL. The efficacy of our model-based AIAs makes them better candidates for empirically quantifying privacy leakage for federated regression tasks.
Researcher Affiliation | Collaboration | 1 Université Côte d'Azur, 2 Inria, 3 Meta, 4 CNRS, 5 I3S, 6 Amadeus. EMAIL, EMAIL, EMAIL
Pseudocode | Yes | Algorithm 1: FL Framework; Algorithm 2: Reconstruction of client-c local model by a passive adversary for federated least squares regression; Algorithm 3: Reconstruction of client-c local model by an active adversary
Open Source Code | Yes | Code: https://github.com/francescodiana99/fedkit-learn
Open Datasets | Yes | Each client stores S_c data points randomly sampled from the ACS Income dataset (Ding et al. 2024). Medical (Lantz 2013): 1,339 records and 6 features (age, gender, BMI, number of children, smoking status, and region). Income (Ding et al. 2024): census information from 50 U.S. states and Puerto Rico, spanning 2014 to 2018.
Dataset Splits | Yes | For all datasets, each client keeps 90% of its data for training and uses 10% for validation.
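The per-client 90/10 split described above can be sketched as follows; `split_client_data` is a hypothetical helper for illustration, not a function from the released fedkit-learn code:

```python
import numpy as np

def split_client_data(data, train_frac=0.9, seed=0):
    """Shuffle one client's local dataset and split it into train/validation."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(data))       # random order of local sample indices
    n_train = int(train_frac * len(data))  # 90% of samples go to training
    return data[idx[:n_train]], data[idx[n_train:]]

# Toy client dataset: 50 samples with 2 features each.
client_data = np.arange(100).reshape(50, 2)
train, val = split_client_data(client_data)  # 45 train rows, 5 validation rows
```

Each client applies this split independently to its own local data before training starts.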
Hardware Specification | No | No specific hardware details (GPU/CPU models, processor types, memory, or cloud instance types) are provided for running the experiments. The text mentions 'mobile phones and IoT devices' as client examples, but not as experimental hardware.
Software Dependencies | No | The paper mentions using FedAvg, training a neural network model, and the Adam optimization algorithm (Kingma and Ba 2015), but it does not specify version numbers for any software libraries (e.g., Python, PyTorch).
Experiment Setup | Yes | In all experiments, each client follows FedAvg to train a neural network with a single hidden layer of 128 neurons, using ReLU as the activation function. The number of communication rounds is fixed to T = 100/E, where E is the number of local epochs. Each client participates in all rounds, i.e., T_c = {0, ..., T−1}. The learning rate is tuned for each training scenario (different datasets and numbers of local epochs), with values provided in (Diana et al. 2024, Appendix B.5). All clients run FedAvg with 1 local epoch and batch size 32.
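The server-side aggregation step of FedAvg can be sketched as below. This is a minimal NumPy sketch under stated assumptions: `fedavg_aggregate` is an illustrative name, the toy parameters stand in for the paper's 128-neuron ReLU network, and the weighting by local dataset size follows the standard FedAvg formulation rather than any detail confirmed by this report:

```python
import numpy as np

def fedavg_aggregate(client_params, client_sizes):
    """FedAvg server step: average client parameters, weighted by local dataset size."""
    total = sum(client_sizes)
    return {
        name: sum((n / total) * params[name]
                  for params, n in zip(client_params, client_sizes))
        for name in client_params[0]
    }

# Toy example: two clients, each holding a 2-parameter "model".
clients = [
    {"w": np.array([1.0, 0.0])},   # client with 10 local samples
    {"w": np.array([3.0, 2.0])},   # client with 30 local samples
]
global_params = fedavg_aggregate(clients, client_sizes=[10, 30])
# → global_params["w"] == array([2.5, 1.5])
```

With E = 1 local epoch per round (as in the reported setup), the server repeats this aggregation for T = 100/E = 100 communication rounds, sending the averaged model back to all clients between rounds.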