FATE: Feature-Adapted Parameter Tuning for Vision-Language Models
Authors: Zhengqin Xu, Zelin Peng, Xiaokang Yang, Wei Shen
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Comprehensive experiments on 11 datasets covering a diverse set of visual recognition tasks demonstrate that FATE achieves leading performance. Additionally, FATE demonstrates remarkable acceleration compared with current prompt-engineering and PEFT methods. |
| Researcher Affiliation | Academia | Zhengqin Xu, Zelin Peng*, Xiaokang Yang, Wei Shen — MoE Key Lab of Artificial Intelligence, AI Institute, Shanghai Jiao Tong University |
| Pseudocode | No | The paper describes the methodology using mathematical formulations and descriptive text, but it does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide an explicit statement about the availability of open-source code for the methodology described, nor does it include any links to code repositories. |
| Open Datasets | Yes | We evaluate the generalizability of our proposed FATE on 11 image classification datasets, including 2 general object recognition datasets: ImageNet (Deng et al. 2009) and Caltech101 (Fei-Fei, Fergus, and Perona 2004); 5 fine-grained image recognition datasets: OxfordPets (Parkhi et al. 2012), StanfordCars (Krause et al. 2013), Flowers102 (Nilsback and Zisserman 2008), Food101 (Bossard, Guillaumin, and Van Gool 2014), and FGVCAircraft (Maji et al. 2013); a scene understanding dataset: SUN397 (Xiao et al. 2010); a texture dataset: DTD (Cimpoi et al. 2014); a satellite-image recognition dataset: EuroSAT (Helber et al. 2019); and an action classification dataset: UCF101 (Soomro, Zamir, and Shah 2012). |
| Dataset Splits | Yes | In line with previous works (Zhou et al. 2022b,a; Khattak et al. 2023), we use a few-shot setting that randomly samples 16 shots for each class in all experiments. The model is trained using only the base classes in a few-shot setting, while evaluation is conducted on both base and novel categories to test generalizability. Cross-dataset Evaluation: As suggested in CoCoOp (Zhou et al. 2022a), we also use the 11 datasets mentioned above for cross-dataset evaluation, in which all models are trained on ImageNet with 1000 categories (each category having 16 training samples) and then directly transferred for evaluation on the other datasets. |
| Hardware Specification | Yes | All models are trained with a cosine learning rate schedule on a single NVIDIA 3090 GPU. |
| Software Dependencies | No | The paper does not provide specific version numbers for software dependencies such as programming languages, libraries, or frameworks used to implement the methodology. |
| Experiment Setup | Yes | In line with previous works (Zhou et al. 2022b,a; Khattak et al. 2023), we use a few-shot setting that randomly samples 16 shots for each class in all experiments. The pre-trained ViT-B/16 CLIP model is used throughout the experiments. We train FATE for 10 epochs with a batch size of 10 and an initial learning rate of 0.002 via an SGD solver. All models are trained with a cosine learning rate schedule on a single NVIDIA 3090 GPU. To maintain robust results, we report Base and Novel class accuracy, and their harmonic mean (HM), averaged over three runs with different seeds. Table 7 shows that with α = 0.001, FATE achieves the optimal trade-off performance. |
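Since the paper releases no code, the experimental protocol in the table can only be reconstructed from the quoted hyperparameters. The sketch below illustrates the two reproducible pieces — 16-shot per-class sampling and a cosine learning-rate schedule decaying from the stated initial rate of 0.002 over 10 epochs. The function names (`cosine_lr`, `sample_few_shot`) and the schedule's exact decay formula are assumptions for illustration, not the authors' implementation.

```python
import math
import random
from collections import defaultdict

# Hyperparameters quoted from the paper's setup; everything else is assumed.
EPOCHS = 10
BASE_LR = 0.002
SHOTS = 16

def cosine_lr(epoch: int, total_epochs: int = EPOCHS, base_lr: float = BASE_LR) -> float:
    """Cosine annealing: decay base_lr toward 0 over total_epochs."""
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * epoch / total_epochs))

def sample_few_shot(dataset, shots: int = SHOTS, seed: int = 0):
    """Randomly sample `shots` examples per class (the 16-shot setting).

    `dataset` is any iterable of (image, label) pairs; a fixed seed mirrors
    the paper's practice of averaging over runs with different seeds.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for image, label in dataset:
        by_class[label].append((image, label))
    subset = []
    for items in by_class.values():
        subset.extend(rng.sample(items, min(shots, len(items))))
    return subset

# The schedule starts at the base LR and decays smoothly to zero:
print(cosine_lr(0))   # 0.002
print(cosine_lr(10))  # 0.0
```

In a full training loop, `cosine_lr(epoch)` would set the SGD optimizer's learning rate at the start of each epoch, and `sample_few_shot` would build the base-class training subset before training begins.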