Federated Unlearning with Gradient Descent and Conflict Mitigation
Authors: Zibin Pan, Zhichao Wang, Chi Li, Kaiyan Zheng, Boqi Wang, Xiaoying Tang, Junhua Zhao
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, extensive experiments in several FL scenarios verify that FedOSD outperforms the SOTA FU algorithms in terms of unlearning and model utility. |
| Researcher Affiliation | Academia | 1 The School of Science and Engineering, The Chinese University of Hong Kong, Shenzhen, China; 2 The Shenzhen Institute of Artificial Intelligence and Robotics for Society; 3 The Guangdong Provincial Key Laboratory of Future Networks of Intelligence; 4 Department of Mathematics and Department of Statistics, University of Michigan; 5 Shenzhen Research Institute of Big Data |
| Pseudocode | Yes | Algorithm 1: FedOSD |
| Open Source Code | Yes | Code https://github.com/zibinpan/FedOSD |
| Open Datasets | Yes | We follow (Zhao et al. 2023) to evaluate the algorithm performance on the public datasets MNIST (LeCun et al. 1998), FMNIST (Xiao, Rasul, and Vollgraf 2017), and CIFAR-10/100 (Krizhevsky and Hinton 2009), where the training/testing data have already been split. |
| Dataset Splits | Yes | We follow (Zhao et al. 2023) to evaluate the algorithm performance on the public datasets MNIST (LeCun et al. 1998), FMNIST (Xiao, Rasul, and Vollgraf 2017), and CIFAR-10/100 (Krizhevsky and Hinton 2009), where the training/testing data have already been split. To evaluate the effectiveness of unlearning across varying heterogeneous local data distributions, we consider four scenarios to assign data for clients: (1) Pat-20: We follow (McMahan et al. 2017) to build a pathological non-IID scenario where each client owns the data of 20% of classes. For example, in a dataset like MNIST with 10 classes, each client has two classes of the data. (2) Pat-50: It constructs a scenario where each client has 50% of classes. (3) Pat-10: It's an extreme data-island scenario where each client has 10% of distinct classes. (4) IID: The data are randomly and equally separated among all clients. |
| Hardware Specification | No | The paper does not explicitly mention specific hardware details such as GPU models, CPU models, or memory specifications used for running its experiments. |
| Software Dependencies | No | The paper mentions various algorithms and models (e.g., SGD, LeNet-5, MLP, CNN, NFResNet18) but does not provide specific software library names with version numbers required to replicate the experiment. |
| Experiment Setup | Yes | We follow the settings of (Halimi et al. 2022; Zhang et al. 2023) that all clients utilize Stochastic Gradient Descent (SGD) on local datasets with local epoch E = 1. We set the batch size as 200 and the learning rate η ∈ {0.005, 0.025, 0.001, 0.0005} with a decay of 0.999 per round, where the best performance of each method is chosen in comparison. Prior to unlearning, we run FedAvg (McMahan et al. 2017) for 2000 communication rounds to generate the original model ω0 for unlearning. The max unlearning round is 100, while the max total communication round (including unlearning and post-training) is 200. |
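The pathological non-IID partitioning quoted in the Dataset Splits row (Pat-20/50/10, following McMahan et al. 2017) can be sketched as below. This is a hypothetical illustration, not code from the FedOSD repository: `pathological_partition` and its parameters are assumptions, and it assigns each client shards from a fixed number of distinct classes (e.g. `classes_per_client=2` reproduces Pat-20 on a 10-class dataset).

```python
import random
from collections import defaultdict

def pathological_partition(labels, num_clients, classes_per_client, seed=0):
    """Pat-style non-IID split: client c receives one shard from each of
    `classes_per_client` distinct classes, chosen consecutively mod the
    number of classes so every class is shared by the same number of
    clients. Illustrative sketch, not the paper's implementation."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    classes = sorted(by_class)
    num_classes = len(classes)
    for cls in classes:
        rng.shuffle(by_class[cls])
    # each class is split among this many clients
    owners_per_class = num_clients * classes_per_client // num_classes
    cursor = defaultdict(int)  # next unused shard per class
    parts = {}
    for c in range(num_clients):
        chosen = [classes[(c * classes_per_client + j) % num_classes]
                  for j in range(classes_per_client)]
        idxs = []
        for cls in chosen:
            pool = by_class[cls]
            size = len(pool) // owners_per_class
            s = cursor[cls]
            idxs.extend(pool[s * size:(s + 1) * size])
            cursor[cls] += 1
        parts[c] = idxs
    return parts
```

With 10 classes, 10 clients, and `classes_per_client=2`, each class is split between exactly two clients and every client holds samples from exactly two classes, matching the Pat-20 description.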
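The training schedule in the Experiment Setup row (local SGD with E = 1, a learning rate drawn from {0.005, 0.025, 0.001, 0.0005}, multiplicative decay of 0.999 per round, up to 200 rounds) can be written as a minimal sketch. The function names and the plain-list parameter representation are assumptions for illustration; the paper's actual code is in its GitHub repository.

```python
def local_sgd_step(params, grads, lr):
    """One SGD update: w <- w - lr * g (one local epoch with E = 1)."""
    return [w - lr * g for w, g in zip(params, grads)]

def learning_rate_schedule(initial_lr=0.005, decay=0.999, rounds=200):
    """Per-round learning rates with multiplicative decay, as described:
    eta_t = initial_lr * decay**t for t = 0..rounds-1."""
    lr, lrs = initial_lr, []
    for _ in range(rounds):
        lrs.append(lr)
        # ... each client would run E = 1 local epoch of SGD at `lr` ...
        lr *= decay
    return lrs
```

In the paper's setup the same decayed schedule covers both the unlearning phase (at most 100 rounds) and post-training, capped at 200 total communication rounds.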