Exploring Vacant Classes in Label-Skewed Federated Learning

Authors: Kuangpu Guo, Yuhe Ding, Jian Liang, Zilei Wang, Ran He, Tieniu Tan

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments validate the efficacy of FedVLS, demonstrating superior performance compared to previous state-of-the-art (SOTA) methods across diverse datasets with varying degrees of label skew.
Researcher Affiliation | Academia | 1 University of Science and Technology of China; 2 NLPR & MAIS, Institute of Automation, Chinese Academy of Sciences; 3 Anhui University; 4 University of Chinese Academy of Sciences; 5 Nanjing University
Pseudocode | Yes | Algorithm 1 in the technical appendix shows the overview of our method.
Open Source Code | Yes | Code: https://github.com/krumpguo/FedVLS
Open Datasets | Yes | We evaluate the effectiveness of our approach across various image classification datasets, including MNIST (Deng 2012), CIFAR10 (Krizhevsky 2009), CIFAR100 (Krizhevsky 2009), and Tiny ImageNet (Le and Yang 2015).
Dataset Splits | No | The paper states: "We partitioned each dataset into distinct training and test sets. Subsequently, the training set undergoes further division into non-overlapping subsets, distributed among different clients." However, it does not provide specific percentages or counts for the global training/test/validation splits for these datasets, nor does it cite a source for such standard splits.
Hardware Specification | No | The paper mentions the network architectures used (MobileNetV2, a DNN for MNIST) but does not provide any specific hardware details such as GPU models, CPU models, or memory.
Software Dependencies | No | The paper mentions using stochastic gradient descent (SGD) optimization but does not specify any software libraries or their version numbers (e.g., Python, PyTorch, TensorFlow, CUDA).
Experiment Setup | Yes | We set the number of clients N to 10 and implement full client participation. We run 100 communication rounds for all experiments on the CIFAR10/100 datasets and 50 communication rounds on the MNIST and Tiny ImageNet datasets. Within each communication round, local training spans 5 epochs for MNIST and 10 epochs for the other datasets. For FedConcat (Diao, Li, and He 2024) and FedGF (Lee and Yoon 2024), we followed the original papers' settings for communication rounds and local epochs. We employ stochastic gradient descent (SGD) optimization with a learning rate of 0.01, a momentum of 0.9, and a batch size of 64. Weight decay is set to 1e-5 for MNIST and CIFAR10 and 1e-4 for CIFAR100 and Tiny ImageNet. The hyperparameter λ of FedVLS in Equation 6 is set to 0.1 for MNIST and CIFAR10, while it is set to 0.5 for CIFAR100 and Tiny ImageNet.
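The per-dataset hyperparameters in the Experiment Setup row can be collected into a small configuration table. This is a hedged sketch only: the names `TrainConfig` and `CONFIGS` are ours, not from the FedVLS codebase, and the values are transcribed from the quoted setup.

```python
# Sketch of the per-dataset training configuration reported in the
# Experiment Setup row. TrainConfig/CONFIGS are illustrative names,
# not identifiers from the FedVLS repository.
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    clients: int = 10         # N clients, full participation
    rounds: int = 100         # communication rounds
    local_epochs: int = 10    # local training epochs per round
    lr: float = 0.01          # SGD learning rate
    momentum: float = 0.9
    batch_size: int = 64
    weight_decay: float = 1e-5
    lam: float = 0.1          # λ in Equation 6


CONFIGS = {
    "MNIST":        TrainConfig(rounds=50,  local_epochs=5,  weight_decay=1e-5, lam=0.1),
    "CIFAR10":      TrainConfig(rounds=100, local_epochs=10, weight_decay=1e-5, lam=0.1),
    "CIFAR100":     TrainConfig(rounds=100, local_epochs=10, weight_decay=1e-4, lam=0.5),
    "TinyImageNet": TrainConfig(rounds=50,  local_epochs=10, weight_decay=1e-4, lam=0.5),
}
```

Keeping these values in one frozen dataclass per dataset makes it easy to check that a reproduction run matches the paper's reported settings.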