Federated Variational Inference: Towards Improved Personalization and Generalization

Authors: Elahe Vedadi, Joshua V. Dillon, Philip Andrew Mansfield, Karan Singhal, Arash Afkanpour, Warren Richard Morningstar

TMLR 2024

Reproducibility assessment. Each entry lists the variable, the result, and the supporting LLM response:
Research Type: Experimental. "We evaluate our model on FEMNIST and CIFAR-100 image classification and show that FedVI beats the state-of-the-art on both tasks." (Section 5, Implementation and Experimental Evaluation)
Researcher Affiliation: Industry. Elahe Vedadi (Google Research), Joshua V. Dillon (Google Research), Philip Andrew Mansfield (Google Research), Karan Singhal (Google Research), Arash Afkanpour (Google Research), Warren Richard Morningstar (Google Research).
Pseudocode: Yes. Algorithm 1 (FedVI Training).
Open Source Code: No. The paper states, "We implement our FedVI algorithm in TensorFlow Federated (TFF)," but provides neither a link to the source code nor an explicit statement of its release.
Open Datasets: Yes. "We evaluate FedVI algorithm on two different datasets, FEMNIST (Caldas et al., 2019) (62-class digit and character classification) and CIFAR-100 (Krizhevsky et al., 2009) (100-class classification)." Both are distributed with TFF: https://www.tensorflow.org/federated/api_docs/python/tff/simulation/datasets/emnist/load_data and https://www.tensorflow.org/federated/api_docs/python/tff/simulation/datasets/cifar100/load_data
Dataset Splits: Yes. "For FEMNIST dataset with 3400 clients we consider the first 20 clients as non-participating users which are held-out in training to better measure generalization as in (Yuan et al., 2022). At each round of training we select 100 clients uniformly at random without replacement, but with replacement across rounds. For CIFAR-100 with 500 training clients, we set the data of the first 10 clients as held-out data and select 50 clients uniformly at random at each round. ... at each epoch we consider the first 50% of each mini-batch as the support set and the other 50% as the query set (i.e., for a mini-batch with 256 data samples the first 128 samples belong to the support set and the rest belong to the query set)."
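The support/query split quoted above is a simple in-order partition of each mini-batch. A minimal sketch in plain Python (the function name `split_support_query` is illustrative, not from the paper):

```python
# Sketch of the split described in the paper: the first 50% of each
# mini-batch is the support set, the remaining 50% is the query set.
def split_support_query(batch, support_fraction=0.5):
    """Split one mini-batch into (support, query) subsets, in order."""
    cut = int(len(batch) * support_fraction)
    return batch[:cut], batch[cut:]

# A 256-sample mini-batch yields 128 support and 128 query samples.
batch = list(range(256))
support, query = split_support_query(batch)
```

With the default fraction this reproduces the quoted 128/128 split for a 256-sample mini-batch.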
Hardware Specification: Yes. "We implement our FedVI algorithm in TensorFlow Federated (TFF) and scale up the implementation to NVIDIA Tesla V100 GPUs for hyperparameter tuning."
Software Dependencies: No. The paper names TensorFlow Federated (TFF) as the implementation framework but gives no version numbers for TFF or for any other library or dependency.
Experiment Setup: Yes. "We train FedVI algorithm on both FEMNIST and CIFAR-100 for 1500 rounds and at each round of training we divide both datasets into mini-batches of 256 data samples and use mini-batch gradient descent to optimize the objective function. ... We use Stochastic Gradient Descent (SGD) for our client optimizer and SGD with momentum for the server optimizer for all experiments (Reddi et al., 2020). We set the client learning rate to 0.03 for CIFAR-100 and 0.02 for FEMNIST, and the server learning rate to 3.0 with momentum 0.9 for both datasets. ... FedVI results are reported for τ = 10^-9 for FEMNIST, and τ = 10^-3 for CIFAR-100."
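The client/server optimizer pairing quoted above follows the FedOpt pattern (Reddi et al., 2020): clients run plain SGD locally, and the server treats the negative average client delta as a pseudo-gradient for an SGD-with-momentum step. A hedged sketch on a scalar "model", for illustration only; the function names are assumptions, and only the FEMNIST hyperparameters quoted above (client lr 0.02, server lr 3.0, momentum 0.9) are from the paper:

```python
# Toy FedOpt-style round on a scalar parameter (sketch, not the paper's code).
def client_sgd(w, grads, lr=0.02):
    """Run local SGD steps from the server model w; return the client delta."""
    local = w
    for g in grads:
        local -= lr * g
    return local - w

def server_update(w, deltas, velocity, lr=3.0, momentum=0.9):
    """SGD-with-momentum server step on the averaged pseudo-gradient."""
    pseudo_grad = -sum(deltas) / len(deltas)  # negative mean client delta
    velocity = momentum * velocity + pseudo_grad
    return w - lr * velocity, velocity

w, v = 1.0, 0.0
# Three toy clients, each seeing the same two local gradients.
deltas = [client_sgd(w, grads=[0.5, 0.25]) for _ in range(3)]
w, v = server_update(w, deltas, v)
```

In a real TFF run, `w` would be the server model weights and each delta would come from one sampled client's local mini-batch SGD pass.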