Rethinking the Starting Point: Collaborative Pre-Training for Federated Downstream Tasks

Authors: Yun-Wei Chu, Dong-Jun Han, Seyyedali Hosseinalipour, Christopher G. Brinton

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Extensive experiments show that CoPreFL significantly enhances average accuracy and reduces variance in arbitrary downstream FL tasks with unseen/seen labels, outperforming various pre-training baselines. Additionally, CoPreFL proves compatible with different well-known FL algorithms used in downstream tasks, boosting performance in each case." (Abstract) ... (Section 4: Experiments)
Researcher Affiliation | Academia | 1 Purdue University, 2 Yonsei University, 3 University at Buffalo-SUNY; EMAIL, EMAIL, EMAIL, EMAIL
Pseudocode | Yes | Algorithm 1: Our Pre-training Method CoPreFL (Pre-training Phase in Scenario I)
Open Source Code | No | The paper neither states that the source code for its methodology will be released nor links to a code repository. The link provided in the footnote points to the arXiv paper itself: https://arxiv.org/abs/2402.02225
Open Datasets | Yes | "For evaluation, we use CIFAR-100 (Krizhevsky 2009), Tiny-ImageNet (Le and Yang 2015), FEMNIST (Caldas et al. 2018), and PACS (Li et al. 2017), following data splits provided in (Park et al. 2021), and adopt ResNet-18 (He et al. 2015)."
Dataset Splits | Yes | "We adopt a standard approach commonly used in meta-learning-based research (Jamal et al. 2020; Shu et al. 2019; Park et al. 2021) for support/query splitting, where we randomly partition each client's data into 80% support and 20% query sets." ... "Each participant in the downstream phase utilizes 80% of its local data as training samples, while the remaining 20% is reserved for testing samples." ... "Data samples are distributed to |M| = 100 clients for pre-training and |G| = 10 clients for downstream FL tasks using the corresponding dataset based on a Dirichlet(α) distribution with α = 0.5."
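The non-IID client partitioning and support/query splitting quoted above can be sketched as follows. This is a generic illustration of per-class Dirichlet(α) partitioning, not the authors' code; the function names and defaults are assumptions, with client counts and α taken from the reported setup.

```python
import numpy as np

def dirichlet_partition(labels, num_clients=100, alpha=0.5, seed=0):
    """Assign sample indices to clients with per-class Dirichlet(alpha) shares.

    Smaller alpha yields more skewed (more non-IID) label distributions.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    client_indices = [[] for _ in range(num_clients)]
    for c in np.unique(labels):
        idx = rng.permutation(np.where(labels == c)[0])
        # One Dirichlet draw per class: each client's share of this class.
        proportions = rng.dirichlet(alpha * np.ones(num_clients))
        cuts = (np.cumsum(proportions)[:-1] * len(idx)).astype(int)
        for client_id, part in enumerate(np.split(idx, cuts)):
            client_indices[client_id].extend(part.tolist())
    return client_indices

def support_query_split(indices, support_frac=0.8, seed=0):
    """Randomly split one client's local indices into 80% support / 20% query."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(np.asarray(indices))
    k = int(len(idx) * support_frac)
    return idx[:k].tolist(), idx[k:].tolist()
```

With |M| = 100 pre-training clients this would be called as `dirichlet_partition(train_labels, num_clients=100, alpha=0.5)`, and each client's list would then be passed through `support_query_split`.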
Hardware Specification | No | The paper does not report the hardware used to run the experiments, such as GPU or CPU models.
Software Dependencies | No | The paper specifies ResNet-18 as the model architecture but does not name any software libraries or frameworks (e.g., PyTorch, TensorFlow), let alone their version numbers.
Experiment Setup | No | The paper states, "We keep the training procedure consistent for each downstream task (see Appendix C.2 for detailed settings)," deferring the specific experimental setup and hyperparameter values to the appendix; they are not present in the main text.