Controlling Federated Learning for Covertness

Authors: Adit Jain, Vikram Krishnamurthy

TMLR 2024

Reproducibility

Variable | Result | LLM Response
Research Type | Experimental | Numerical results show that when the learner uses the optimal policy, an eavesdropper can only achieve a validation accuracy of 52% with no information and 69% when it has a public dataset with 10% positive samples, compared to 83% when the learner employs a greedy policy. The proposed methods are demonstrated on a novel application, covert federated learning (FL) on a text classification task using large language model embeddings. Our key numerical results are summarized in Table 2.
Researcher Affiliation | Academia | Adit Jain (EMAIL), Department of Electrical and Computer Engineering, Cornell University; Vikram Krishnamurthy (EMAIL), Department of Electrical and Computer Engineering, Cornell University
Pseudocode | Yes | Algorithm 1: Structured Policy Gradient Algorithm

Input: initial policy parameters Θ_0, perturbation parameter ω, iterations K, step size κ, scale parameter ρ, learning cost l, privacy cost c
Output: policy parameters Θ_K

procedure ComputeStationaryPolicy(ω, K, κ, ρ)
    for n = 1 ... K do
        Γ ← Bernoulli(1/2): 3|S_O||S_E| i.i.d. Bernoulli random variables
        Θ_n^+ ← Θ_n + Γω,  Θ_n^- ← Θ_n − Γω
        l̂ ← AvgCost(l, Θ_n)
        ∇̂l ← AvgCost(l, Θ_n^+) − AvgCost(l, Θ_n^-)
        ∇̂c ← AvgCost(c, Θ_n^+) − AvgCost(c, Θ_n^-)
        update Θ_n and ξ_n using (15)
    end for
end procedure

procedure AvgCost(J, Θ)
    ν̂ ← PolicyFromParameters(Θ)
    return (1/T) Σ_{t=1}^T J(ν̂(y_t), y_t)
end procedure
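Algorithm 1 uses a simultaneous-perturbation (SPSA-style) two-point gradient estimate. The following numpy sketch illustrates only that estimation-and-update loop for a single cost; the privacy cost, the multiplier ξ_n, and the exact update rule of Eq. (15) are omitted, and the quadratic toy cost, step sizes, and policy shape are illustrative placeholders, not the paper's formulation:

```python
import numpy as np

rng = np.random.default_rng(0)

def avg_cost(J, theta, ys):
    """Average a per-sample cost J over a trajectory of observations ys."""
    return np.mean([J(theta, y) for y in ys])

def spsa_descent(J, ys, theta0, omega=0.1, kappa=0.01, K=500):
    """Two-point SPSA: perturb every coordinate at once by +/- omega
    along a random {-1, +1} direction and form a gradient estimate."""
    theta = theta0.copy()
    for _ in range(K):
        # Bernoulli(1/2) draws mapped to a Rademacher direction in {-1, +1}
        gamma = rng.integers(0, 2, size=theta.shape) * 2 - 1
        j_plus = avg_cost(J, theta + gamma * omega, ys)
        j_minus = avg_cost(J, theta - gamma * omega, ys)
        grad_hat = (j_plus - j_minus) / (2 * omega) * gamma
        theta -= kappa * grad_hat  # placeholder for the Eq. (15) update
    return theta

# Toy quadratic cost with minimum at theta = y
J = lambda theta, y: np.sum((theta - y) ** 2)
ys = [np.ones(4)]
theta_star = spsa_descent(J, ys, theta0=np.zeros(4))
```

Despite perturbing all coordinates with a single random direction per step, the estimate is unbiased for the true gradient of the quadratic cost, so the iterates drift to the minimizer.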
Open Source Code | Yes | Hate speech classification is still an open problem, and the achieved accuracy is barely satisfactory, but our aim was to demonstrate the application of our formulation. Our source code can be found at this anonymized link.
Open Datasets | Yes | We use Jigsaw's Unintended Bias in Toxicity Classification dataset for our experimental results. The dataset has 1.8 million public comments from the Civil Comments platform. The dataset was annotated by human raters for toxic conversational attributes, mainly rating the toxicity of each text on a scale of 0 to 1, with sub-categories for severe toxicity, obscenity, threat, insult, identity attack, and sexually explicit content. More information about the annotation process can be found on the Kaggle page for this dataset.
Dataset Splits | Yes | Each client has 5443 training samples and 1443 validation samples. For the experimental results, we consider N = 45 communication rounds (or queries) and M = 16 successful model updates (around 34% of the total queries). The original dataset is imbalanced, with 1,660,540 non-toxic samples and 144,334 toxic samples; for each experimental run, we take a random balanced subset with 144,334 toxic and 144,334 non-toxic samples. The eavesdropper accuracy is calculated using a balanced validation dataset of size 2886.
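The balanced-subset construction described above (all minority-class samples plus an equal-size random draw from the majority class) can be sketched with index selection; the synthetic label array below is a scaled-down stand-in for the Jigsaw data's 1,660,540 / 144,334 split:

```python
import numpy as np

rng = np.random.default_rng(0)

def balanced_subset(labels, rng):
    """Return shuffled indices of a class-balanced subset: every
    minority-class sample plus an equal-size random majority draw."""
    pos = np.flatnonzero(labels == 1)   # toxic (minority class)
    neg = np.flatnonzero(labels == 0)   # non-toxic (majority class)
    k = min(len(pos), len(neg))
    idx = np.concatenate([rng.choice(pos, k, replace=False),
                          rng.choice(neg, k, replace=False)])
    rng.shuffle(idx)
    return idx

# Synthetic imbalanced labels standing in for the full dataset
labels = np.array([0] * 1000 + [1] * 90)
idx = balanced_subset(labels, rng)  # 180 indices, 90 per class
```

Drawing a fresh subset per experimental run, as the paper does, amounts to calling `balanced_subset` with a new random state each time.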
Hardware Specification | No | 20 runs of the hate speech classification task took around 23 hours, whereas, within the same time frame, we could do 1040 runs of the image classification task.
Software Dependencies | No | To demonstrate the versatility of our methods, we optimize our neural network using Adam (Kingma & Ba, 2017) and run FedAvg (McMahan et al., 2017) instead of FedSGD. Using the preprocessed training data, we fine-tune our model to minimize the binary cross-entropy (BCE) loss.
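The FedAvg aggregation step referenced above (McMahan et al., 2017) averages client updates weighted by each client's sample count. A minimal sketch, with a flat parameter vector standing in for the actual model weights:

```python
import numpy as np

def fedavg(client_params, client_sizes):
    """FedAvg aggregation: average client parameter vectors,
    weighted by each client's number of training samples."""
    sizes = np.asarray(client_sizes, dtype=float)
    weights = sizes / sizes.sum()               # normalize to sum to 1
    stacked = np.stack(client_params)           # shape (num_clients, dim)
    return (weights[:, None] * stacked).sum(axis=0)

# Two clients with equal data (5443 training samples each, as above),
# so the aggregate is the plain mean of their parameters
p1, p2 = np.array([0.0, 2.0]), np.array([2.0, 0.0])
global_params = fedavg([p1, p2], [5443, 5443])
```

With equal client sizes this reduces to an unweighted mean; unequal sizes tilt the global model toward clients holding more data.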
Experiment Setup | Yes | Our architecture used for training involves the following layer sequence: a pre-trained BERT layer which outputs a 128-length embedding, a fully connected 128-neuron-wide linear layer with ReLU activation, a dropout layer with a rate of 10^-1, and finally, a linear layer classifying the text as hate speech or not. We consider the logit loss function. We use the following hyperparameters for training: learning rate 10^-3, training batch size of 40, and validation batch size of 20.
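As a sketch of the classifier head that sits on top of the BERT embedding, the forward pass below uses randomly initialized numpy weights in place of trained parameters; the 128-d input is a placeholder for the BERT output, and inverted dropout is shown only in training mode:

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

def classifier_head(emb, W1, b1, W2, b2, dropout_rate=0.1, train=False):
    """128-d embedding -> 128-wide linear + ReLU -> dropout -> 2-way logits."""
    h = relu(emb @ W1 + b1)
    if train:  # inverted dropout, active only during training
        mask = rng.random(h.shape) > dropout_rate
        h = h * mask / (1.0 - dropout_rate)
    return h @ W2 + b2  # logits for {not hate speech, hate speech}

# Random weights standing in for the trained model
W1, b1 = rng.normal(scale=0.1, size=(128, 128)), np.zeros(128)
W2, b2 = rng.normal(scale=0.1, size=(128, 2)), np.zeros(2)
logits = classifier_head(rng.normal(size=128), W1, b1, W2, b2)
```

The logits feed the BCE-style loss during training; at inference the predicted class is simply `logits.argmax()`.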