Enhancing Foundation Models with Federated Domain Knowledge Infusion

Authors: Jiaqi Wang, Jingtao Li, Weiming Zhuang, Chen Chen, Lingjuan Lyu, Fenglong Ma

ICML 2025

Reproducibility Variable Result LLM Response
Research Type Experimental Experiments of the CLIP model on three domain-shifting datasets, ImageCLEF-DA, Office-Home, and DomainNet, demonstrate the superior performance of FedAG in both in-domain and out-of-domain scenarios.
Researcher Affiliation Collaboration 1 The Pennsylvania State University, 2 Sony AI. Correspondence to: Lingjuan Lyu <EMAIL>, Fenglong Ma <EMAIL>.
Pseudocode Yes Algorithm 1 shows the pseudo-code of the proposed FedAG model, which contains two main updates: the client update (lines 6-14) and the server update (lines 15-32).
Open Source Code Yes The source code can be found at https://github.com/JackqqWang/fedag.
Open Datasets Yes Datasets. To fairly validate the proposed model FedAG in our experiments, we focus on the image classification task on three commonly used domain-shifting datasets: DomainNet, Office-Home, and ImageCLEF-DA. More details can be found in Appendix. ... DomainNet. It contains 569,010 images in total from 6 domains: clipart, infographics, painting, quickdraw, real, and sketch. Each domain contains 48K to 172K images, categorized into 345 classes. Office-Home. It has 15,500 images from 4 different domains: artistic images, clip art, product images, and real-world images. Each domain has 65 object classes. ImageCLEF-DA. It is a benchmark for the ImageCLEF 2014 domain adaptation challenge, including Caltech-256, ImageNet ILSVRC 2012, and Pascal VOC 2012. There are 12 categories and 50 images in each category. Footnotes 3, 4, 5 provide URLs to these datasets.
Dataset Splits Yes The number of synthetic images for each training domain equals 10% of the real domain data. ... We train the models using the domains shown in the table and conduct the testing with the remaining domain data. ... For all the scenarios, we always keep the Sketch domain for out-of-domain testing.
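The leave-one-domain-out protocol quoted above can be sketched as a small helper. This is a minimal illustration, not code from the FedAG repository; the DomainNet domain names come from the dataset description, while the function names and the `held_out` default are assumptions.

```python
# Hedged sketch of the leave-one-domain-out split described in the paper.
# Domain names follow DomainNet; helper names are illustrative.
DOMAINS = ["clipart", "infograph", "painting", "quickdraw", "real", "sketch"]

def make_split(train_domains, all_domains=DOMAINS, held_out="sketch"):
    """Train on `train_domains`; all remaining domains (always including
    the held-out Sketch domain) serve as out-of-domain test sets."""
    assert held_out not in train_domains, "Sketch is reserved for testing"
    test_domains = [d for d in all_domains if d not in train_domains]
    return train_domains, test_domains

def synthetic_budget(n_real_images, fraction=0.10):
    """Synthetic images per training domain = 10% of that domain's real data."""
    return int(n_real_images * fraction)
```

For example, training on clipart and real leaves the other four domains (including sketch) for out-of-domain evaluation, and a domain with 1,000 real images would receive 100 synthetic images.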
Hardware Specification Yes All experiments are conducted on an NVIDIA A6000 with CUDA version 12.0, running on an Ubuntu 20.04.6 LTS server.
Software Dependencies Yes All baselines and the proposed FedAG are implemented using PyTorch 2.0.1.
Experiment Setup Yes Our experimental setup involves 10 communication rounds. For the local update, we set the local training epoch as 10, the local learning rate as 1e-4, the batch size as 32, γ = 0.9, and use Adam as the optimizer. For the server update, we set λ = 0.1, δ = 1e-3, the epoch of quality-aware in-domain mutual learning as 3, and the epoch of adapter initialization as 5.
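The reported hyperparameters can be collected into a single configuration sketch. Only the numeric values come from the paper; the key names, the `federated_schedule` helper, and the round structure are illustrative assumptions, not the authors' released code.

```python
# Hedged sketch of the reported FedAG experiment configuration.
# Values are from the paper; key names are illustrative.
CONFIG = {
    "communication_rounds": 10,
    "local_epochs": 10,
    "local_lr": 1e-4,
    "batch_size": 32,
    "gamma": 0.9,                  # client-side γ
    "optimizer": "Adam",
    "server_lambda": 0.1,          # server-side λ
    "server_delta": 1e-3,          # server-side δ
    "mutual_learning_epochs": 3,   # quality-aware in-domain mutual learning
    "adapter_init_epochs": 5,
}

def federated_schedule(cfg):
    """Expand the config into (round, phase, epochs) steps: each
    communication round runs a client update then a server update."""
    schedule = []
    for rnd in range(cfg["communication_rounds"]):
        schedule.append((rnd, "client_update", cfg["local_epochs"]))
        schedule.append((rnd, "server_update", cfg["mutual_learning_epochs"]))
    return schedule
```

With 10 rounds and two phases per round, the schedule has 20 steps, matching the client/server alternation described in Algorithm 1 of the paper.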