PeFLL: Personalized Federated Learning by Learning to Learn
Authors: Jonathan Scott, Hossein Zakerinia, Christoph H. Lampert
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section we report on our experimental evaluation. The values reported in every table and plot are given as the mean together with the standard deviation across three random seeds. (A sketch of this aggregation follows the table.) |
| Researcher Affiliation | Academia | Jonathan Scott Institute of Science and Technology Austria (ISTA) EMAIL Hossein Zakerinia Institute of Science and Technology Austria (ISTA) EMAIL Christoph H. Lampert Institute of Science and Technology Austria (ISTA) EMAIL |
| Pseudocode | Yes | Pseudocode of the specific steps is provided in Algorithms 1 and 2. |
| Open Source Code | Yes | We provide the code as supplemental material. We will publish it when the anonymity requirement is lifted. |
| Open Datasets | Yes | For our experiments, we use three datasets that are standard benchmarks for FL: CIFAR10/CIFAR100 (Krizhevsky, 2009) and FEMNIST (Caldas et al., 2018). [...] Additional experiments on the Shakespeare dataset (Caldas et al., 2018) are provided in Appendix A. |
| Dataset Splits | Yes | The hyperparameters for all methods are tuned using validation data that was held out from the training set (10,000 samples for CIFAR10 and CIFAR100, spread across the clients, and 10% of each client's data for FEMNIST). (A hedged sketch of this split appears after the table.) |
| Hardware Specification | No | The paper mentions support from 'Scientific Computing (Sci Comp)' and that a ResNet20 implementation was used, but does not provide specific details on the CPU, GPU, or memory used for experiments. |
| Software Dependencies | No | The paper mentions the use of 'SGD' as the optimizer and implies a deep learning framework (e.g., PyTorch for the ResNet implementation), but it does not specify exact version numbers for any software dependencies. |
| Experiment Setup | Yes | We train all methods, except Local, for 5000 rounds with partial client participation. For CIFAR10 and CIFAR100 client participation is set to 5% per round... The optimizer used for training at the client is SGD with a batch size of 32, a learning rate chosen via grid search and momentum set to 0.9. The batch size used for computing the descriptor is also 32. [...] the dimension of the embedding vectors is l = n/4 and the number of client SGD steps is k = 50. The regularization parameters for the embedding network and hypernetwork are set to λ_h = λ_v = 10^-3, while the output regularization is λ_θ = 0. (A client-training sketch follows the table.) |
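
The reporting convention quoted in the Research Type row (mean with standard deviation over three random seeds) amounts to a one-line aggregation. A minimal NumPy sketch, using hypothetical per-seed accuracies:

```python
import numpy as np

# Hypothetical test accuracies from three random seeds (values are illustrative).
per_seed = np.array([87.2, 86.8, 87.5])

mean = per_seed.mean()
std = per_seed.std(ddof=1)  # sample standard deviation across seeds
print(f"{mean:.1f} ± {std:.1f}")  # prints "87.2 ± 0.4"
```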
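
The validation holdout quoted in the Dataset Splits row (10,000 samples spread across clients for CIFAR10/CIFAR100; 10% of each client's data for FEMNIST) could look roughly like the following. This is a sketch under assumptions: `client_datasets` (a mapping from client id to a list of samples) and the even per-client spread are not specified by the paper.

```python
import random

def holdout_validation(client_datasets, femnist=False, total_holdout=10_000, seed=0):
    """Split each client's local data into train/validation parts.

    CIFAR10/CIFAR100: hold out `total_holdout` samples in total, spread
    evenly across clients (even spread is an assumption, not from the paper).
    FEMNIST: hold out 10% of each client's data.
    """
    rng = random.Random(seed)
    per_client = total_holdout // len(client_datasets)
    train, val = {}, {}
    for cid, samples in client_datasets.items():
        samples = list(samples)
        rng.shuffle(samples)
        n_val = len(samples) // 10 if femnist else min(per_client, len(samples))
        val[cid] = samples[:n_val]
        train[cid] = samples[n_val:]
    return train, val
```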
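
The client-side optimization in the Experiment Setup row maps onto a standard PyTorch training loop. The sketch below uses only the stated settings (SGD, batch size 32, momentum 0.9, k = 50 local steps); the learning rate is a stand-in for the grid-searched value, `model` and `loader` are hypothetical placeholders, and the server-side regularizers λ_h, λ_v, λ_θ for the embedding network and hypernetwork are omitted.

```python
import itertools
import torch

BATCH_SIZE = 32   # batch size stated in the paper (also used for the descriptor)
LOCAL_STEPS = 50  # k = 50 client SGD steps per round
MOMENTUM = 0.9
LR = 0.1          # the paper grid-searches this; 0.1 is an assumed placeholder

def client_update(model, loader):
    """One round of local training on a single client."""
    opt = torch.optim.SGD(model.parameters(), lr=LR, momentum=MOMENTUM)
    loss_fn = torch.nn.CrossEntropyLoss()
    model.train()
    # Take exactly k batches, cycling through the loader if it is shorter.
    for x, y in itertools.islice(itertools.cycle(loader), LOCAL_STEPS):
        opt.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        opt.step()
    return model.state_dict()
```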