Private Multi-Task Learning: Formulation and Applications to Federated Learning
Authors: Shengyuan Hu, Steven Wu, Virginia Smith
TMLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, we find that our method provides improved privacy/utility trade-offs relative to global baselines across common federated learning benchmarks. Finally, we explore the performance of our approach on common federated learning benchmarks (Section 5). Our results show that we can retain the accuracy benefits of MTL in these settings relative to global baselines while still providing meaningful privacy guarantees. We empirically evaluate our private MTL solver on common federated learning benchmarks (Caldas et al., 2018). We first demonstrate the superior privacy-utility trade-off that exists when training our private MTL method compared with training a single global model (Section 5.2). We also compare our method with simple fine-tuning, exploring the results of performing local fine-tuning after learning an MTL objective vs. a global objective (Section 5.3). |
| Researcher Affiliation | Academia | Shengyuan Hu EMAIL Carnegie Mellon University Zhiwei Steven Wu EMAIL Carnegie Mellon University Virginia Smith EMAIL Carnegie Mellon University |
| Pseudocode | Yes | Algorithm 1 PMTL: Private Mean-Regularized MTL. 1: Input: m, T, λ, η, {w_1^0, …, w_m^0}, w̃^0 = (1/m) Σ_{k=1}^m w_k^0. 2: for t = 0, …, T−1 do. 3: Global learner randomly selects a set of tasks S_t and broadcasts the mean weight w̃^t. 4: for k ∈ S_t in parallel do. 5: Each client updates its weight w_k for E iterations (o_k is the last iteration task k is selected). 6: Each client sends g_k^{t+1} = w_k^{t+1} − w_k^t back to the global learner. 7: end for. 8: Global learner computes a noisy aggregate of the weights: w̃^{t+1} = w̃^t + (1/\|S_t\|) Σ_{k∈S_t} g_k^{t+1} · min(1, γ/‖g_k^{t+1}‖_2) + N(0, σ² I_{d×d}). 9: end for. 10: Output w_1, …, w_m as differentially private personalized models. Client Update(w): for j = 0, …, E−1 do: task learner performs SGD locally, w = w − η(∇_w l_k(w) + λ(w − w̃^t)); end for. |
| Open Source Code | Yes | Our code is publicly available at: https://github.com/s-huu/PMTL |
| Open Datasets | Yes | We empirically evaluate our private MTL solver on common federated learning benchmarks (Caldas et al., 2018). We summarize the details of the datasets and models we used in our empirical study in Table 2. Our experiments include both convex (logistic regression) and non-convex (CNN) loss objectives on both text (Stack Overflow) and image (Celeb A and FEMNIST) datasets. FEMNIST (Cohen et al., 2017; Caldas et al., 2018): 205 clients, 4-layer CNN, 62-class image classification. Stack Overflow (tff): 400 clients, logistic regression, 500-class tag prediction. Celeb A (Liu et al., 2015; Caldas et al., 2018): 515 clients, 4-layer CNN, binary image classification. |
| Dataset Splits | No | The paper mentions using common federated learning benchmarks and discusses partitioning data among clients (tasks), but does not explicitly provide the training/test/validation splits for these datasets. It refers to 'test accuracy' and 'validation accuracy' implying such splits exist, but does not detail them. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware (e.g., GPU models, CPU types, memory) used to run the experiments. |
| Software Dependencies | No | The paper mentions "Tensorflow federated" and that the code makes "use of the FL implementation from the public repo of Laguel et al. (2021) and Li et al. (2021)." However, it does not provide specific version numbers for these software components or any other libraries used. |
| Experiment Setup | Yes | For all experiments, we evaluate the test accuracy and privacy parameter of our private MTL solver given a fixed clipping bound γ, variance of Gaussian noise σ², and number of communication rounds T. All experiments are performed on common federated learning benchmarks as a natural application of multi-task learning. We provide a detailed description of datasets and models in Appendix A.4. In all our experiments, we subsample 100 different tasks for each round, i.e. q = 100, to perform local training and participate in global aggregation. For FEMNIST and Celeb A, we choose σ ∈ {0.02, 0.05, 0.1} and γ ∈ {0.2, 0.5, 1}. For Stack Overflow, we choose σ ∈ {0.01, 0.05, 0.1} and γ ∈ {0.1, 0.5, 1}. We summarize both utility and privacy performance for different hyperparameters below. |
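The mechanism summarized in the Pseudocode row (mean-regularized local SGD, per-client update clipping, and Gaussian noise on the aggregate) can be sketched in a few lines. The following toy simulation is an illustrative assumption, not the authors' implementation (their code is at https://github.com/s-huu/PMTL): the quadratic per-task losses, hyperparameter values, and function names are all hypothetical choices used only to show the clip-average-noise structure of one communication round.

```python
# Sketch of one PMTL-style round: mean-regularized local SGD, clipped
# client deltas, noisy aggregation. Toy losses and names are illustrative.
import numpy as np

rng = np.random.default_rng(0)

def client_update(w, w_mean, grad_fn, eta=0.1, lam=0.5, E=5):
    """Local SGD on the mean-regularized objective l_k(w) + (lam/2)||w - w_mean||^2."""
    for _ in range(E):
        w = w - eta * (grad_fn(w) + lam * (w - w_mean))
    return w

def pmtl_round(weights, w_mean, grad_fns, selected,
               gamma=1.0, sigma=0.05, eta=0.1, lam=0.5, E=5):
    """One round: each selected client trains locally and sends its clipped
    weight delta; the server averages deltas and adds Gaussian noise."""
    deltas = []
    for k in selected:
        new_w = client_update(weights[k], w_mean, grad_fns[k], eta, lam, E)
        g = new_w - weights[k]
        weights[k] = new_w  # client keeps its personalized model
        g = g * min(1.0, gamma / (np.linalg.norm(g) + 1e-12))  # clip to norm gamma
        deltas.append(g)
    noise = rng.normal(0.0, sigma, size=w_mean.shape)
    return w_mean + np.mean(deltas, axis=0) + noise

# Toy run: 3 clients (tasks) with quadratic losses centered at different optima.
d = 4
targets = [np.full(d, c, dtype=float) for c in (1.0, 2.0, 3.0)]
grad_fns = [lambda w, t=t: w - t for t in targets]  # gradient of 0.5||w - t||^2
weights = [np.zeros(d) for _ in range(3)]
w_mean = np.zeros(d)
for _ in range(50):
    w_mean = pmtl_round(weights, w_mean, grad_fns, selected=[0, 1, 2])
```

After training, each client's model sits between its own optimum and the shared noisy mean, which is the personalization/regularization trade-off the paper's λ controls; only the clipped, noised deltas ever leave a client.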