First-Order Federated Bilevel Learning
Authors: Yifan Yang, Peiyao Xiao, Shiqian Ma, Kaiyi Ji
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments, conducted on this application and federated hyper-representation, demonstrate the effectiveness of the proposed algorithm. [...] We first conduct an experiment on federated data cleaning to compare the performance of our proposed MemFBO algorithms with multiple benchmark personalized federated learning algorithms [...]. We then perform a federated hyper-representation experiment to compare the performance of our method with other FBO algorithms [...]. The former experiment tests the robustness of different methods on the CIFAR10 dataset with CNN backbones [...]. The latter experiment tests the communication and memory efficiency on MNIST with MLP backbones [...]. The test accuracy of the different (personalized) federated learning methods under different noise levels is reported in Table 1. |
| Researcher Affiliation | Academia | 1Department of Computer Science and Engineering, University at Buffalo 2Department of Computational Applied Mathematics and Operations Research, Rice University EMAIL, EMAIL, EMAIL, EMAIL |
| Pseudocode | Yes | Algorithm 1: MemFBO |
| Open Source Code | No | The paper does not contain an explicit statement about the release of open-source code or a link to a code repository. |
| Open Datasets | Yes | The former experiment tests the robustness of different methods on the CIFAR10 dataset with CNN backbones, following the same experimental setup as in (Collins et al. 2021). The latter experiment tests the communication and memory efficiency on MNIST with MLP backbones, following the experiment setup in (Tarzanagh et al. 2022). |
| Dataset Splits | Yes | For the baselines, we use 100 clients with the CIFAR10 dataset split into 500 training and 100 test samples per client. To simulate noise, the 500 training samples are divided into 450 noisy samples and 50 clean samples. A subset of the 450 samples is corrupted based on a noise rate (the proportion of corrupted data) and a flip rate. In this experiment, the noise rate is set to 0%, 30%, 50%, and 70%. The flip rate determines the portion of a corrupted sample's labels that are reassigned to one of all labels uniformly at random. We fix the flip rate at 80%. Finally, we set up the data heterogeneity following the same procedure as in FedRep (Collins et al. 2021), where each client is assigned 2 classes. |
| Hardware Specification | No | The paper discusses 'edge devices like smartphones' in the context of the problem being solved, but does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments described in Section 6. |
| Software Dependencies | No | The paper mentions 'CNN backbones' and 'MLP backbones' but does not specify any software libraries or frameworks with version numbers (e.g., Python, PyTorch, TensorFlow versions) that were used for implementation or experimentation. |
| Experiment Setup | Yes | For the baselines, we use 100 clients with the CIFAR10 dataset split into 500 training and 100 test samples per client. To simulate noise, the 500 training samples are divided into 450 noisy samples and 50 clean samples. A subset of the 450 samples is corrupted based on a noise rate (the proportion of corrupted data) and a flip rate. In this experiment, the noise rate is set to 0%, 30%, 50%, and 70%. The flip rate determines the portion of a corrupted sample's labels that are reassigned to one of all labels uniformly at random. We fix the flip rate at 80%. Finally, we set up the data heterogeneity following the same procedure as in FedRep (Collins et al. 2021), where each client is assigned 2 classes. |
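As an illustration of the label-noise procedure quoted above (the noise rate selects which of the noisy-candidate samples are corrupted; the flip rate then reassigns labels uniformly at random among all classes), here is a minimal sketch. The function name and the exact sampling details are our assumptions for clarity, not the authors' released implementation:

```python
import random

def corrupt_labels(labels, noise_rate, flip_rate, num_classes=10, seed=0):
    """Illustrative label-noise simulation (assumed, not the paper's code):
    select a `noise_rate` fraction of samples as corrupted, then flip each
    corrupted sample's label with probability `flip_rate`, reassigning it
    uniformly at random among all `num_classes` labels."""
    rng = random.Random(seed)
    labels = list(labels)
    n = len(labels)
    corrupted_idx = rng.sample(range(n), int(noise_rate * n))
    for i in corrupted_idx:
        if rng.random() < flip_rate:
            # Uniform over all labels, so the new label may equal the old one.
            labels[i] = rng.randrange(num_classes)
    return labels

# Example matching the quoted setup: 450 noisy-candidate samples,
# a 50% noise rate, and an 80% flip rate on CIFAR10 (10 classes).
clean = [i % 10 for i in range(450)]
noisy = corrupt_labels(clean, noise_rate=0.5, flip_rate=0.8)
changed = sum(a != b for a, b in zip(clean, noisy))
print(f"{changed} of 450 labels changed")
```

Note that with a uniform reassignment over all 10 classes, roughly one in ten flips lands back on the original label, so the observed fraction of changed labels is slightly below noise_rate × flip_rate.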