FedCFA: Alleviating Simpson’s Paradox in Model Aggregation with Counterfactual Federated Learning

Authors: Zhonghua Jiang, Jimin Xu, Shengyu Zhang, Tao Shen, Jiwei Li, Kun Kuang, Haibin Cai, Fei Wu

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct extensive experiments on six datasets and verify that our method outperforms other FL methods in terms of efficiency and global model accuracy under limited communication rounds.
Researcher Affiliation | Academia | 1 Zhejiang University, 2 East China Normal University, EMAIL, EMAIL
Pseudocode | Yes | The main steps of our proposed FedCFA framework are shown in Algorithm 1, where T is the total communication rounds set for FL, and w_k^t is the model parameters of client k in the t-th round.
Open Source Code | No | The paper does not contain an explicit statement about the release of source code or a link to a code repository.
Open Datasets | Yes | Datasets: CIFAR10, CIFAR100 (Krizhevsky, Hinton et al. 2009), Tiny-ImageNet, FEMNIST (Caldas et al. 2018), Sent140 (Go, Bhayani, and Huang 2009), MNIST. We built a dataset with Simpson's Paradox based on MNIST.
Dataset Splits | Yes | We use two different data partition methods: IID and Non-IID. IID partition distributes samples uniformly to K clients through random sampling; we use IID_K to represent this data division. For Non-IID, we utilize the Dirichlet distribution Dir_K(α) to simulate dataset imbalance. The smaller the α, the greater the data difference between clients. We try several different client numbers and data partition methods: Dir60(0.6), Dir60(0.2), Dir100(0.2), Dir100(0.6), IID60, IID100. We use the Dirichlet distribution to adjust the frequency of different category labels in each client to simulate label distribution P(Y) heterogeneity among clients. For FEMNIST, we divide different users into different clients to simulate feature distribution P(X) heterogeneity due to handwriting style variance. For the binary classification text dataset Sent140, we divide it into different clients based on users and ensure consistent label distribution among clients, to simulate heterogeneity of the conditional feature distribution P(X|Y).
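The Dirichlet-based Non-IID split quoted above is a standard partitioning recipe: for each class, a Dirichlet draw over clients decides what fraction of that class each client receives. A minimal sketch of that idea (the function name `dirichlet_partition` and the toy label array are assumptions for illustration, not from the paper):

```python
import numpy as np

def dirichlet_partition(labels, num_clients, alpha, seed=0):
    """Split sample indices across clients, with each class's share per
    client drawn from Dir(alpha); smaller alpha -> more heterogeneity."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    client_indices = [[] for _ in range(num_clients)]
    for c in np.unique(labels):
        idx = rng.permutation(np.where(labels == c)[0])
        # Proportions of this class assigned to each client.
        props = rng.dirichlet(alpha * np.ones(num_clients))
        cuts = (np.cumsum(props)[:-1] * len(idx)).astype(int)
        for client, part in enumerate(np.split(idx, cuts)):
            client_indices[client].extend(part.tolist())
    return [np.array(ix) for ix in client_indices]

# Example matching the paper's Dir60(0.2) setting on toy 10-class labels.
labels = np.repeat(np.arange(10), 600)
parts = dirichlet_partition(labels, num_clients=60, alpha=0.2)
```

With α = 0.6 the same call yields noticeably more balanced client label distributions, matching the paper's statement that smaller α means greater inter-client difference.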
Hardware Specification | Yes | We conduct experiments on an NVIDIA A100 with 40GB memory.
Software Dependencies | No | Using FedLab (Zeng et al. 2023), we build a typical FL scenario. No specific version numbers for FedLab or other software dependencies (e.g., Python, PyTorch) are provided.
Experiment Setup | Yes | Unless specified otherwise, we use MLP, ResNet18 and LSTM as network models, with 60 clients, a learning rate β of 0.01, one local epoch, batch size of 128, and 500 communication rounds.