FLR: Label-Mixture Regularization for Federated Learning with Noisy Labels

Authors: Taehyeon Kim, Donggyu Kim, Se-Young Yun

TMLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We empirically find that FLR aligns with and advances existing FL and noisy label mitigation methods over multiple datasets under various levels of data heterogeneity and label noise."
Researcher Affiliation | Collaboration | Taehyeon Kim (EMAIL, KAIST AI); Donggyu Kim (EMAIL, Medipixel); Se-Young Yun (EMAIL, KAIST AI)
Pseudocode | Yes | Algorithm 1: FLR. Input: neural network M(·), server model parameter θ_server, randomly initialized parameter θ_0, client k's model parameter θ_k, global rounds T, local training epochs E, learning rate η, client k's dataset D_k = {(x_i^k, y_i^k)}_{i=1}^{n_k}, balancing hyperparameters for FLR α, β, γ.
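The inputs above describe a standard federated outer loop: each client initializes from the server parameters, trains locally for E epochs, and the server aggregates. A minimal sketch of that loop follows; `local_step` is a hypothetical callback standing in for FLR's label-mixture training update (the α/β/γ-weighted loss, whose internals the row does not specify), and `sgd_step` is an illustrative least-squares stand-in, not the paper's model.

```python
import numpy as np

def fedavg_round(theta_server, client_datasets, local_epochs, lr, local_step):
    """One global round of Algorithm 1's outer loop: every client starts
    from the server parameters, runs E local epochs, and the server
    averages the returned client parameters (FedAvg-style aggregation)."""
    client_params = []
    for dataset in client_datasets:
        theta_k = theta_server.copy()  # client k initializes from the server model
        for _ in range(local_epochs):
            # Hypothetical local update; in FLR this would apply the
            # label-mixture regularized loss balanced by alpha, beta, gamma.
            theta_k = local_step(theta_k, dataset, lr)
        client_params.append(theta_k)
    return np.mean(client_params, axis=0)  # server aggregation

def sgd_step(theta, data, lr):
    """Illustrative local step: one gradient step on ||X @ theta - y||^2."""
    X, y = data
    grad = 2.0 * X.T @ (X @ theta - y) / len(y)
    return theta - lr * grad
```

Running T global rounds amounts to calling `fedavg_round` repeatedly, feeding each round's output back in as `theta_server`.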
Open Source Code | No | The paper does not provide an explicit statement about releasing code, nor a link to a code repository for the described methodology.
Open Datasets | Yes | "We evaluate our methods on two standard benchmark datasets, CIFAR-10 and CIFAR-100 (Krizhevsky et al., 2009), and two real-world datasets, CIFAR-10N (Wei et al., 2021) and Clothing1M (Xiao et al., 2015), with varying numbers of clients."
Dataset Splits | Yes | "We conduct experiments with 100 clients for CIFAR-10 and CIFAR-10N, 50 clients for CIFAR-100, and 500 clients for Clothing1M." Split sizes: 50,000 train / 10,000 test. Non-IIDness is parameterized by p and α_Dir: "We then distribute the data samples of class c among the clients with Φ_c = 1 using Latent Dirichlet Allocation (LDA)..."
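The LDA-style split described above is commonly implemented by drawing per-client class proportions from a Dirichlet(α_Dir) prior. A minimal sketch of that partitioning scheme, under the assumption that this is the standard Dirichlet split (the function name and seed handling are my own, not the paper's):

```python
import numpy as np

def dirichlet_partition(labels, n_clients, alpha_dir, seed=0):
    """Split sample indices among n_clients: for each class, draw client
    proportions from Dirichlet(alpha_dir) and slice the class's samples
    accordingly. Smaller alpha_dir gives more heterogeneous clients."""
    rng = np.random.default_rng(seed)
    n_classes = int(labels.max()) + 1
    client_indices = [[] for _ in range(n_clients)]
    for c in range(n_classes):
        idx = np.flatnonzero(labels == c)
        rng.shuffle(idx)
        # Per-client share of class c, then cut points into the shuffled indices.
        props = rng.dirichlet(alpha_dir * np.ones(n_clients))
        cuts = (np.cumsum(props)[:-1] * len(idx)).astype(int)
        for k, part in enumerate(np.split(idx, cuts)):
            client_indices[k].extend(part.tolist())
    return [np.asarray(ci, dtype=int) for ci in client_indices]

# Toy example: 1,000 samples over 10 classes split among 10 clients.
labels = np.repeat(np.arange(10), 100)
parts = dirichlet_partition(labels, n_clients=10, alpha_dir=0.5)
assert sum(len(p) for p in parts) == len(labels)
```

With α_Dir → ∞ every client approaches the global class distribution; with small α_Dir each client's data concentrates on a few classes.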
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU or CPU models, or cloud instance types) used to run its experiments.
Software Dependencies | No | The paper mentions various methods and models, such as FedProx, GCE, and ResNet-18, but does not provide version numbers for the software dependencies or libraries used in its implementation.
Experiment Setup | Yes | "We set (λ, α, β, γ) typically to (2.0, 0.9, 0.7, 0.5) with R_w at 50, with further details provided in the Appendix." Learning rates: 0.03, 0.01, 0.003. γ values are selected from the 150th epoch onward; phase 1 is a FedAvg warmup during the first 50 epochs.
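The reported settings can be collected into a single configuration sketch. The key names below are my own labels (the paper releases no code, so there are no canonical identifiers to match), and only the values quoted above are taken from the source:

```python
# Hyperparameters as reported in the experiment-setup row above.
FLR_SETUP = {
    "lambda_": 2.0,            # loss-balancing weight lambda
    "alpha": 0.9,              # FLR balancing hyperparameter alpha
    "beta": 0.7,               # FLR balancing hyperparameter beta
    "gamma": 0.5,              # FLR balancing hyperparameter gamma
    "warmup_rounds": 50,       # R_w: FedAvg warmup (phase 1), first 50 epochs
    "gamma_start_epoch": 150,  # gamma selection begins at the 150th epoch
    "learning_rates": [0.03, 0.01, 0.003],  # values reported in the setup
}
```

A reproduction attempt would still need the Appendix for the per-dataset assignment of these learning rates and any remaining schedule details.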