FairSMOE: Mitigating Multi-Attribute Fairness Problem with Sparse Mixture-of-Experts
Authors: Changdi Yang, Zheng Zhan, Ci Zhang, Yifan Gong, Yize Li, Zichong Meng, Jun Liu, Xuan Shen, Hao Tang, Geng Yuan, Pu Zhao, Xue Lin, Yanzhi Wang
IJCAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments demonstrated our effectiveness. Taking a DeiT-Small as the backbone, FairSMoE achieves 77.25% and 86.01% accuracy on the ISIC2019 and CelebA datasets respectively, with Multi-attribute Predictive Quality Disparity (PQD) scores of 0.801 and 0.787, beating current state-of-the-art methods such as Muffin and MultiFair. |
| Researcher Affiliation | Collaboration | (1) Northeastern University, (2) Microsoft Research, (3) University of Georgia, (4) Peking University |
| Pseudocode | Yes | Algorithm 1 Fairness-Guided Routing (FGR) |
| Open Source Code | No | The paper does not contain an explicit statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | We evaluate our methods on the ISIC 2019 and CelebA datasets, primarily for skin lesion analysis and facial attribute recognition tasks, respectively, with more details in Appendix. |
| Dataset Splits | Yes | We randomly separate ISIC2019 80:20 for training and test, and randomly select 5% of training set for validation. |
| Hardware Specification | Yes | 4 Nvidia A100s are used for training and testing. |
| Software Dependencies | No | The paper mentions 'AdamW' as an optimizer and 'Transformers' models but does not provide specific version numbers for software libraries, programming languages, or other dependencies. |
| Experiment Setup | Yes | We applied a batch size of 256 and data augmentation of Random Resized Crop for all methods on both datasets. Transformers are optimized with AdamW with weight decay of 1 × 10⁻⁴, initial learning rate (LR) of 5 × 10⁻⁴. Training epoch is set to 300 for ISIC2019 and 500 for CelebA. We set ϖ as 0.6 in Equation (8) and ε as 0.1 in L_total. |
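
The Dataset Splits and Experiment Setup rows can be illustrated with a minimal sketch. It is not the authors' code, which is not publicly available according to the table. The function name `split_indices`, the seed, and the dictionary `REPORTED_SETUP` are hypothetical. The sketch assumes that the 5% validation samples are removed from the training set, and it reads the extracted weight decay and learning rate as 1e-4 and 5e-4.

```python
import random


def split_indices(n_samples, test_frac=0.20, val_frac_of_train=0.05, seed=0):
    """Randomly split sample indices 80:20 into train/test, then hold out
    5% of the training portion for validation.

    Assumption: validation samples are removed from the training set;
    the summary does not say whether they are.
    """
    rng = random.Random(seed)  # hypothetical seed, not reported in the source
    indices = list(range(n_samples))
    rng.shuffle(indices)

    n_test = round(n_samples * test_frac)
    test = indices[:n_test]
    train_full = indices[n_test:]

    n_val = round(len(train_full) * val_frac_of_train)
    val = train_full[:n_val]
    train = train_full[n_val:]
    return train, val, test


# Hyperparameters as listed in the Experiment Setup row.
# Assumption: "1 10 4" and "5 10 4" are read as 1e-4 and 5e-4.
REPORTED_SETUP = {
    "batch_size": 256,
    "augmentation": "RandomResizedCrop",
    "optimizer": "AdamW",
    "weight_decay": 1e-4,
    "initial_lr": 5e-4,
    "epochs": {"ISIC2019": 300, "CelebA": 500},
}

train, val, test = split_indices(1000)
print(len(train), len(val), len(test))
```

The split works on indices rather than on images, so it can be paired with any dataset loader. The table describes this split only for ISIC2019.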