Enhancing Foundation Models with Federated Domain Knowledge Infusion

Authors: Jiaqi Wang, Jingtao Li, Weiming Zhuang, Chen Chen, Lingjuan Lyu, Fenglong Ma

ICML 2025

Reproducibility Variable Result LLM Response
Research Type Experimental Experiments of the CLIP model on three domain-shifting datasets, ImageCLEF-DA, Office-Home, and DomainNet, demonstrate the superior performance of FedAG in both in-domain and out-of-domain scenarios.
Researcher Affiliation Collaboration 1 The Pennsylvania State University, 2 Sony AI. Correspondence to: Lingjuan Lyu <EMAIL>, Fenglong Ma <EMAIL>.
Pseudocode Yes Algorithm 1 shows the pseudo-code of the proposed FedAG model, which contains two main updates: the client update (lines 6-14) and the server update (lines 15-32).
Open Source Code Yes The source code can be found at https://github.com/JackqqWang/fedag.
Open Datasets Yes Datasets. To fairly validate the proposed model FedAG in our experiments, we focus on the image classification task on three commonly used domain-shifting datasets: DomainNet, Office-Home, and ImageCLEF-DA. More details can be found in Appendix. ... DomainNet. It contains 569,010 images in total from 6 domains: clipart, infographics, painting, quickdraw, real, and sketch. Each domain contains 48K to 172K images, categorized into 345 classes. Office-Home. It has 15,500 images from 4 different domains: artistic images, clip art, product images, and real-world images. Each domain has 65 object classes. ImageCLEF-DA. It is a benchmark for the ImageCLEF 2014 domain adaptation challenge, including Caltech-256, ImageNet ILSVRC 2012, and Pascal VOC 2012. There are 12 categories and 50 images in each category. Footnotes 3, 4, 5 provide URLs to these datasets.
Dataset Splits Yes The number of synthetic images for each training domain equals 10% of the real domain data. ... We train the models using the domains shown in the table and conduct the testing with the remaining domain data. ... For all the scenarios, we always keep the Sketch domain for out-of-domain testing.
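The leave-one-domain-out protocol quoted above can be sketched as a small helper. This is a minimal illustration, not code from the FedAG repository; the DomainNet domain names come from the dataset description, while the function names and the `held_out` default are assumptions.

```python
# Hedged sketch of the leave-one-domain-out split described in the paper.
# Domain names follow DomainNet; helper names are illustrative.
DOMAINS = ["clipart", "infograph", "painting", "quickdraw", "real", "sketch"]

def make_split(train_domains, all_domains=DOMAINS, held_out="sketch"):
    """Train on `train_domains`; all remaining domains (always including
    the held-out Sketch domain) serve as out-of-domain test sets."""
    assert held_out not in train_domains, "Sketch is reserved for testing"
    test_domains = [d for d in all_domains if d not in train_domains]
    return train_domains, test_domains

def synthetic_budget(n_real_images, fraction=0.10):
    """Synthetic images per training domain = 10% of that domain's real data."""
    return int(n_real_images * fraction)
```

For example, training on clipart and real leaves the other four domains (including sketch) for out-of-domain evaluation, and a domain with 1,000 real images would receive 100 synthetic images.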
Hardware Specification Yes All experiments are conducted on an NVIDIA A6000 with CUDA version 12.0, running on an Ubuntu 20.04.6 LTS server.
Software Dependencies Yes All baselines and the proposed FedAG are implemented using PyTorch 2.0.1.
Experiment Setup Yes Our experimental setup involves 10 communication rounds. For the local update, we set the local training epoch as 10, the local learning rate as 1e-4, the batch size as 32, γ = 0.9, and use Adam as the optimizer. For the server update, we set λ = 0.1, δ = 1e-3, the epoch of quality-aware in-domain mutual learning as 3, and the epoch of adapter initialization as 5.
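The reported hyperparameters can be collected into a single configuration sketch. Only the numeric values come from the paper; the key names, the `federated_schedule` helper, and the round structure are illustrative assumptions, not the authors' released code.

```python
# Hedged sketch of the reported FedAG experiment configuration.
# Values are from the paper; key names are illustrative.
CONFIG = {
    "communication_rounds": 10,
    "local_epochs": 10,
    "local_lr": 1e-4,
    "batch_size": 32,
    "gamma": 0.9,                  # client-side γ
    "optimizer": "Adam",
    "server_lambda": 0.1,          # server-side λ
    "server_delta": 1e-3,          # server-side δ
    "mutual_learning_epochs": 3,   # quality-aware in-domain mutual learning
    "adapter_init_epochs": 5,
}

def federated_schedule(cfg):
    """Expand the config into (round, phase, epochs) steps: each
    communication round runs a client update then a server update."""
    schedule = []
    for rnd in range(cfg["communication_rounds"]):
        schedule.append((rnd, "client_update", cfg["local_epochs"]))
        schedule.append((rnd, "server_update", cfg["mutual_learning_epochs"]))
    return schedule
```

With 10 rounds and two phases per round, the schedule has 20 steps, matching the client/server alternation described in Algorithm 1 of the paper.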