Rewarding the Rare: Maverick-Aware Shapley Valuation in Federated Learning
Authors: Mengwei Yang, Baturalp Buyukates, Athina Markopoulou
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on benchmark datasets demonstrate that FedMS improves model performance and better recognizes valuable client contributions, even under scenarios involving adversaries, free-riders, and skewed or rare-class distributions. |
| Researcher Affiliation | Academia | Mengwei Yang (EMAIL), Department of Electrical Engineering and Computer Science, University of California, Irvine; Baturalp Buyukates (EMAIL), School of Computer Science, University of Birmingham; Athina Markopoulou (EMAIL), Department of Electrical Engineering and Computer Science, University of California, Irvine |
| Pseudocode | Yes | Algorithm 1: Maverick-Shapley GTG (MS-GTG) ... Algorithm 2: FedMS: a Maverick-Shapley Client Selection Mechanism for FL ... Algorithm 3: Maverick-Shapley MR (MS-MR) ... Algorithm 4: Maverick-Shapley TMR (MS-TMR) |
| Open Source Code | No | The paper does not provide an explicit statement or link for open-source code for the described methodology. |
| Open Datasets | Yes | Datasets. We use five benchmark datasets: (1) MNIST (Deng, 2012), consisting of handwritten digits, with 60,000 samples for training and 10,000 for testing; and (2) CIFAR-10 (Krizhevsky et al., 2009), consisting of colored images in 10 classes, with 50,000 samples for training and 10,000 for testing. The other three datasets come from MedMNIST (Yang et al., 2021; 2023), a large-scale MNIST-like collection of standardized biomedical images: (3) BloodMNIST (Acevedo et al., 2020), containing blood cell microscope data, with 11,959 samples for training, 1,712 for validation, and 3,421 for testing; (4) OrganAMNIST (Bilic et al., 2023; Xu et al., 2019), containing abdominal CT data, with 34,561 samples for training, 6,491 for validation, and 17,778 for testing; and (5) PathMNIST (Kather et al., 2019), consisting of colon pathology data, with 89,996 samples for training, 10,004 for validation, and 7,180 for testing. |
| Dataset Splits | Yes | For both MNIST and CIFAR-10, we follow common practice in existing Shapley-based FL methods (MR, TMR, and GTG) by using a balanced server-side validation set: we randomly split off 20% of the testing samples as the validation set. For the BloodMNIST, OrganAMNIST, and PathMNIST datasets, we use the provided validation sets, which follow the imbalanced distribution of the training samples. |
| Hardware Specification | Yes | Our algorithm is implemented in PyTorch and we run experiments on two NVIDIA RTX A5000 GPUs and two Intel Xeon Silver 4316 CPUs. |
| Software Dependencies | No | The paper mentions PyTorch but does not specify a version number or other key software dependencies with versions. |
| Experiment Setup | Yes | FL Setup. We train for 100 rounds on MNIST and 200 on CIFAR-10, both with a batch size of 64. We employ 5 local training rounds for MNIST and a single local training round for CIFAR-10. We train BloodMNIST for 200 rounds and OrganAMNIST for 100 rounds. PathMNIST is trained for 100 rounds under the Mavericks + Dir(10) and Mavericks + Dir(1) settings, and for 200 rounds under the Mavericks + Dir(0.1) setting. For the BloodMNIST, OrganAMNIST, and PathMNIST datasets, we use one local training round and a batch size of 64. The learning rate is 0.05 for all baselines on all datasets. Our algorithm is implemented in PyTorch and we run experiments on two NVIDIA RTX A5000 GPUs and two Intel Xeon Silver 4316 CPUs. ... For our proposed FedMS, we choose temperature T = 0.01, α = 0.8, and error thresholds ϵb = 0.01, ϵi = 0.001. |
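The pseudocode entries (MS-GTG, MS-MR, MS-TMR) all build on Shapley valuation of FL clients. As a point of reference, the exact Shapley value that such methods approximate can be sketched as below; the exhaustive enumeration costs O(2^n) model evaluations, which is precisely why the paper's approximation variants exist. The toy additive utility and client names are illustrative, not from the paper.

```python
import itertools
import math

def shapley_values(clients, utility):
    """Exact per-round Shapley values for a set of FL clients.

    `utility(S)` returns the performance (e.g. validation accuracy) of the
    model aggregated from client subset S (a frozenset). Exponential in
    len(clients); practical methods like MS-GTG approximate this.
    """
    n = len(clients)
    phi = {c: 0.0 for c in clients}
    for c in clients:
        others = [x for x in clients if x != c]
        for r in range(n):
            for S in itertools.combinations(others, r):
                S = frozenset(S)
                # Standard Shapley weight: |S|! * (n - |S| - 1)! / n!
                weight = math.factorial(r) * math.factorial(n - r - 1) / math.factorial(n)
                phi[c] += weight * (utility(S | {c}) - utility(S))
    return phi

# Toy additive utility: client "m" (think: a Maverick holding a rare class)
# contributes the most. For an additive game, Shapley values equal the
# individual contributions.
contrib = {"a": 0.2, "b": 0.2, "m": 0.6}
u = lambda S: sum(contrib[c] for c in S)
phi = shapley_values(list(contrib), u)
print({c: round(v, 3) for c, v in phi.items()})
```

The efficiency axiom guarantees the values sum to the grand-coalition utility, here u({a, b, m}) = 1.0.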
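The reported hyperparameter T = 0.01 suggests a temperature-scaled softmax over client contribution scores during selection, where a low temperature concentrates probability mass on the highest-valued clients. The sketch below is a hypothetical illustration of that idea (the function name and without-replacement sampling scheme are our assumptions, not the FedMS algorithm itself):

```python
import math
import random

def select_clients(scores, k, T=0.01, seed=0):
    """Sample k distinct clients with probability proportional to
    softmax(score / T). With T = 0.01 (the paper's setting), selection
    is sharply biased toward high-scoring clients. Illustrative sketch.
    """
    rng = random.Random(seed)
    # Subtract the max score before exponentiating for numerical stability.
    mx = max(scores.values())
    pool = {c: math.exp((s - mx) / T) for c, s in scores.items()}
    chosen = set()
    for _ in range(k):
        names, weights = zip(*pool.items())
        pick = rng.choices(names, weights=weights)[0]
        chosen.add(pick)
        pool.pop(pick)  # sample without replacement
    return chosen

# Client "m" has the highest score, so it is selected almost surely.
print(select_clients({"a": 0.2, "b": 0.2, "m": 0.6}, k=2))
```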
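The Mavericks + Dir(10), Dir(1), and Dir(0.1) settings refer to Dirichlet-based non-IID partitioning of training data across clients, where a smaller concentration parameter α yields more skewed per-client class distributions. A generic sketch of this common FL recipe (not the paper's exact partitioning code) follows:

```python
import numpy as np

def dirichlet_partition(labels, n_clients, alpha, seed=0):
    """Split sample indices across clients: for each class, draw client
    proportions from Dir(alpha) and cut that class's samples accordingly.
    Smaller alpha (e.g. 0.1) -> more skewed, non-IID clients.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    parts = [[] for _ in range(n_clients)]
    for cls in np.unique(labels):
        idx = rng.permutation(np.where(labels == cls)[0])
        props = rng.dirichlet([alpha] * n_clients)
        # Convert cumulative proportions into cut points over this class.
        cuts = (np.cumsum(props)[:-1] * len(idx)).astype(int)
        for part, chunk in zip(parts, np.split(idx, cuts)):
            part.extend(chunk.tolist())
    return parts

# 10 classes with 100 samples each; alpha = 0.1 gives highly skewed clients.
labels = np.repeat(np.arange(10), 100)
parts = dirichlet_partition(labels, n_clients=5, alpha=0.1)
print(sorted(len(p) for p in parts))
```

Every sample lands on exactly one client, so the partition sizes always sum to the dataset size.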
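The 20% server-side validation split described under Dataset Splits for MNIST and CIFAR-10 amounts to randomly carving a fifth of the test samples into a held-out validation set. A minimal sketch (the function name and fixed seed are our choices; the paper does not specify its seed):

```python
import random

def split_validation(test_indices, frac=0.2, seed=0):
    """Randomly split off `frac` of the test indices as a server-side
    validation set; returns (validation_indices, remaining_test_indices).
    """
    idx = list(test_indices)
    random.Random(seed).shuffle(idx)
    cut = int(len(idx) * frac)
    return idx[:cut], idx[cut:]

# MNIST has 10,000 test samples -> 2,000 validation, 8,000 test.
val, test = split_validation(range(10000))
print(len(val), len(test))  # 2000 8000
```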