Attribute Inference Attacks for Federated Regression Tasks

Authors: Francesco Diana, Othmane Marfoq, Chuan Xu, Giovanni Neglia, Frédéric Giroire, Eoin Thomas

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We benchmark our proposed attacks against state-of-the-art methods using real-world datasets. The results demonstrate a significant increase in reconstruction accuracy, particularly in heterogeneous client datasets, a common scenario in FL. The efficacy of our model-based AIAs makes them better candidates for empirically quantifying privacy leakage for federated regression tasks.
Researcher Affiliation | Collaboration | 1 Université Côte d'Azur, 2 Inria, 3 Meta, 4 CNRS, 5 I3S, 6 Amadeus. EMAIL, EMAIL, EMAIL
Pseudocode | Yes | Algorithm 1: FL Framework; Algorithm 2: Reconstruction of client-c local model by a passive adversary for federated least squares regression; Algorithm 3: Reconstruction of client-c local model by an active adversary
Open Source Code | Yes | Code: https://github.com/francescodiana99/fedkit-learn
Open Datasets | Yes | Each client stores S_c data points randomly sampled from the ACS Income dataset (Ding et al. 2024). Medical (Lantz 2013): 1,339 records and 6 features (age, gender, BMI, number of children, smoking status, and region). Income (Ding et al. 2024): census information from 50 U.S. states and Puerto Rico, spanning 2014 to 2018.
Dataset Splits | Yes | For all datasets, each client keeps 90% of its data for training and uses 10% for validation.
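The per-client 90/10 split described above can be sketched as follows; `split_client_data` is a hypothetical helper for illustration, not a function from the released fedkit-learn code:

```python
import numpy as np

def split_client_data(data, train_frac=0.9, seed=0):
    """Shuffle one client's local dataset and split it into train/validation."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(data))       # random order of local sample indices
    n_train = int(train_frac * len(data))  # 90% of samples go to training
    return data[idx[:n_train]], data[idx[n_train:]]

# Toy client dataset: 50 samples with 2 features each.
client_data = np.arange(100).reshape(50, 2)
train, val = split_client_data(client_data)  # 45 train rows, 5 validation rows
```

Each client applies this split independently to its own local data before training starts.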
Hardware Specification | No | No specific hardware details (GPU/CPU models, processor types, memory, or cloud instance types) are provided for running the experiments. The text mentions 'mobile phones and IoT devices' as client examples, but not as experimental hardware.
Software Dependencies | No | The paper mentions using FedAvg, training a neural network model, and the Adam optimization algorithm (Kingma and Ba 2015), but it does not specify version numbers for any software libraries (e.g., Python, PyTorch).
Experiment Setup | Yes | In all experiments, each client follows FedAvg to train a neural network with a single hidden layer of 128 neurons, using ReLU as the activation function. The number of communication rounds is fixed to T = 100/E, where E is the number of local epochs. Each client participates in all rounds, i.e., T_c = {0, ..., T−1}. The learning rate is tuned for each training scenario (different datasets and numbers of local epochs), with values provided in (Diana et al. 2024, Appendix B.5). All clients run FedAvg with 1 local epoch and batch size 32.
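The server-side aggregation step of FedAvg can be sketched as below. This is a minimal NumPy sketch under stated assumptions: `fedavg_aggregate` is an illustrative name, the toy parameters stand in for the paper's 128-neuron ReLU network, and the weighting by local dataset size follows the standard FedAvg formulation rather than any detail confirmed by this report:

```python
import numpy as np

def fedavg_aggregate(client_params, client_sizes):
    """FedAvg server step: average client parameters, weighted by local dataset size."""
    total = sum(client_sizes)
    return {
        name: sum((n / total) * params[name]
                  for params, n in zip(client_params, client_sizes))
        for name in client_params[0]
    }

# Toy example: two clients, each holding a 2-parameter "model".
clients = [
    {"w": np.array([1.0, 0.0])},   # client with 10 local samples
    {"w": np.array([3.0, 2.0])},   # client with 30 local samples
]
global_params = fedavg_aggregate(clients, client_sizes=[10, 30])
# → global_params["w"] == array([2.5, 1.5])
```

With E = 1 local epoch per round (as in the reported setup), the server repeats this aggregation for T = 100/E = 100 communication rounds, sending the averaged model back to all clients between rounds.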