Adaptive Client Sampling in Federated Learning via Online Learning with Bandit Feedback

Authors: Boxin Zhao, Lingxiao Wang, Ziqi Liu, Zhiqiang Zhang, Jun Zhou, Chaochao Chen, Mladen Kolar

JMLR 2025

Reproducibility Variable | Result | LLM Response
Research Type: Experimental. "Through both simulated and real data experiments, we empirically illustrate the advantages of the proposed client sampling algorithm over uniform sampling and existing online-learning-based sampling strategies. The proposed adaptive sampling procedure is applicable beyond the FL problem studied here and can be used to improve the performance of stochastic optimization procedures such as stochastic gradient descent and stochastic coordinate descent."
Researcher Affiliation: Collaboration. Boxin Zhao (EMAIL), University of Chicago Booth School of Business, Chicago, IL 60637, USA; Lingxiao Wang (EMAIL), Department of Data Science, New Jersey Institute of Technology, Newark, NJ 07106, USA; Ziqi Liu (EMAIL), Zhiqiang Zhang (EMAIL), and Jun Zhou (EMAIL), Ant Financial Group, Hangzhou, Zhejiang, China; Chaochao Chen (EMAIL), College of Computer Science and Technology, Zhejiang University, Hangzhou, Zhejiang, China; Mladen Kolar (EMAIL), Marshall School of Business, University of Southern California, Los Angeles, CA 90089, USA.
Pseudocode: Yes. Algorithm 1: OSMD Sampler; Algorithm 2: Solver for Step 7 of Algorithm 1; Algorithm 3: Mini-batch SGD with OSMD Sampler; Algorithm 4: FedAvg with OSMD Sampler; Algorithm 5: Adaptive-OSMD Sampler; Algorithm 6: Adaptive-OSMD Sampler with Doubling Trick (Adaptive-Doubling-OSMD); Algorithm 7: Adaptive sampling without replacement.
Open Source Code: Yes. "Code to replicate the results in this paper is available at https://github.com/boxinz17/FL-Client-Sampling."
Open Datasets: Yes. "We use three commonly used computer vision data sets: MNIST (LeCun and Cortes, 2010), KMNIST (Clanuwat et al., 2018), and FMNIST (Xiao et al., 2017)."
Dataset Splits: Yes. "We set the number of devices to be M = 500. To better simulate the situation where our method brings significant convergence speed improvement, we create a highly skewed sample size distribution of the training set among clients: 65% of clients have only one training sample, 20% of clients have 5 training samples, 10% of clients have 30 training samples, and 5% of clients have 100 training samples. This setting tries to illustrate a real-life situation where most of the data come from a small fraction of users, while most of the users have only a small number of samples. ... In addition, each client has 10 validation samples used to measure the prediction accuracy of the model over the training process."
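The skewed split quoted above is straightforward to generate. A minimal sketch, assuming only the stated fractions and sample sizes (the function name `skewed_client_sizes` and the seeding are illustrative, not from the paper):

```python
import random

# Assign per-client training-set sizes following the paper's skewed split:
# 65% of clients get 1 sample, 20% get 5, 10% get 30, and 5% get 100.
def skewed_client_sizes(num_clients=500, seed=0):
    fractions = [(0.65, 1), (0.20, 5), (0.10, 30), (0.05, 100)]
    sizes = []
    for frac, n_samples in fractions:
        sizes += [n_samples] * round(frac * num_clients)
    # Guard against rounding drift, then shuffle the assignment to clients.
    sizes = (sizes + [1] * num_clients)[:num_clients]
    random.Random(seed).shuffle(sizes)
    return sizes

sizes = skewed_client_sizes()  # 500 entries, e.g. 325 clients with 1 sample
```

With M = 500 the fractions divide evenly (325 / 100 / 50 / 25 clients), so no rounding correction is actually triggered.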
Hardware Specification: No. The paper mentions the "University of Chicago Booth Mercury Computing Cluster" in the acknowledgments but gives no hardware details (e.g., GPU/CPU models or memory) for the experiments.
Software Dependencies: No. The paper does not name any software components with version numbers (e.g., Python, PyTorch, or TensorFlow versions) used to implement the methodology or run the experiments.
Experiment Setup: Yes. "We set the number of clients as M = 100, and each client has n_m = 100 samples... The dimension of the problem is set as d = 10... We use stochastic gradient descent to make global updates. At each round t, we choose a subset of K = 5 clients... For each client m ∈ S_t, we choose a mini-batch of samples, B_m^t, of size B = 10... The parameter w is updated as w^{t+1} = w^t + μ_SGD Σ_{i ∈ B_m^t} (y_{m,i} − ⟨w^t, x_{m,i}⟩) x_{m,i}, where μ_SGD is the learning rate, set as μ_SGD = 0.1 in simulations. In all experiments, we set α in the Adaptive-OSMD Sampler as α = 0.4. The tuning parameters for MABS, VRB, and Avare are set as in their original papers. ... The learning rate in SGD is set to 0.075 for MNIST and KMNIST, and to 0.03 for FMNIST. The total number of communication rounds is set to 1,000. In each round of communication, we choose K = 10 clients to participate (2% of the total number of clients). For a chosen client m, we compute its local mini-batch gradient with batch size min{5, n_m}, where n_m is the training sample size on client m."
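The simulated setup above can be sketched in a few lines: sample K clients from a sampling distribution, then apply the least-squares update w ← w + μ_SGD Σ_{i∈B_m} (y_i − ⟨w, x_i⟩) x_i for each sampled client. This is a minimal stdlib-only sketch of the generic sampled-SGD round, not the paper's OSMD Sampler itself; the function name `sgd_round` and the data layout (each client as an `(xs, ys)` pair) are assumptions for illustration:

```python
import random

# One communication round of mini-batch SGD with probabilistic client sampling
# for the least-squares simulation. `probs` is the client sampling distribution
# (uniform or adaptively learned); mu=0.1 matches the paper's simulation setting.
def sgd_round(w, clients, probs, K=5, B=10, mu=0.1, rng=random):
    # Draw K clients (with replacement) according to `probs`.
    sampled = rng.choices(range(len(clients)), weights=probs, k=K)
    for m in sampled:
        xs, ys = clients[m]
        # Mini-batch of size min(B, n_m), as in the experiment setup.
        batch = rng.sample(range(len(xs)), min(B, len(xs)))
        # Accumulate the residual-weighted gradient over the mini-batch,
        # then take one step: w <- w + mu * sum_i (y_i - <w, x_i>) x_i.
        grad = [0.0] * len(w)
        for i in batch:
            resid = ys[i] - sum(wj * xj for wj, xj in zip(w, xs[i]))
            grad = [g + resid * xj for g, xj in zip(grad, xs[i])]
        w = [wj + mu * g for wj, g in zip(w, grad)]
    return w
```

Swapping the `weights` passed to `rng.choices` is the only change needed to move from uniform sampling to an adaptive sampling distribution such as the one the OSMD Sampler maintains.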