Exploring the Better Multimodal Synergy Strategy for Vision-Language Models
Authors: Xiaotian Yin, Xin Liu, Si Chen, Yuan Wang, Yuwen Pan, Tianzhu Zhang
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our extensive experiments demonstrate that DsRA improves the generalizability under few-shot classification, base-to-new generalization, and domain generalization settings. |
| Researcher Affiliation | Academia | Deep Space Exploration Laboratory/School of Information Science and Technology, University of Science and Technology of China EMAIL, EMAIL |
| Pseudocode | No | The paper describes the proposed method using textual descriptions and mathematical equations (e.g., equations 1-17) but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks. |
| Open Source Code | No | Our code will be released soon. |
| Open Datasets | Yes | In line with CLIP (Radford et al. 2021), we utilize 11 image benchmark datasets: ImageNet (Img) (Deng et al. 2009), Caltech101 (Cal) (Fei-Fei, Fergus, and Perona 2004), FGVCAircraft (FGV) (Maji et al. 2013), Flower102 (Flo) (Nilsback and Zisserman 2008), Food101 (Foo) (Bossard, Guillaumin, and Van Gool 2014), OxfordPets (Pet) (Parkhi et al. 2012), StanfordCars (Car) (Krause et al. 2013), EuroSAT (Eur) (Helber et al. 2019), DTD (Cimpoi et al. 2014), SUN397 (SUN) (Xiao et al. 2010), and UCF101 (UCF) (Soomro, Zamir, and Shah 2012). ... Additionally, we conduct experiments to evaluate DsRA's domain generalization capabilities, leveraging ImageNet (Img) as the source dataset and considering its diverse domain variants, such as ImageNetV2 (V2) (Recht et al. 2019), ImageNet-Sketch (S) (Wang et al. 2019), ImageNet-A (A) (Hendrycks et al. 2021b), and ImageNet-R (R) (Hendrycks et al. 2021a), as the target datasets. |
| Dataset Splits | Yes | We first evaluate our model on few-shot classification, where models are trained on 1, 2, 4, 8 and 16 shots and then applied to the test sets. ... To ensure fairness, we follow the experimental methodologies outlined in CoOp (Zhou et al. 2022b) and CoCoOp (Zhou et al. 2022a), including dataset splits, data augmentation, and backbones. ... To test the base-to-new generalization ability, we follow CoCoOp to train our model only on the base classes in a 16-shot setting and evaluate the model on base and new categories. |
| Hardware Specification | Yes | All experiments are conducted on a single RTX 3090 GPU. |
| Software Dependencies | No | All experiments are conducted using the CLIP model with a ViT-B/16 backbone. The hidden dimension dr is set to 50. SGD is used for optimization with a learning rate of 2.7e-3 and a batch size of 4. |
| Experiment Setup | Yes | The hidden dimension dr is set to 50. SGD is used for optimization with a learning rate of 2.7e-3 and a batch size of 4. All experiments are conducted on a single RTX 3090 GPU. The main results are averaged over three runs. |
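Since the paper's code has not yet been released, the reported experiment setup can be summarized as a configuration sketch. This is a hypothetical reconstruction from the table above; the class and field names (`DsRAConfig`, `hidden_dim`, etc.) are illustrative assumptions, not the authors' actual code.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DsRAConfig:
    """Hypothetical training configuration matching the paper's reported setup."""
    backbone: str = "ViT-B/16"   # CLIP visual backbone
    hidden_dim: int = 50         # hidden dimension d_r
    optimizer: str = "SGD"
    lr: float = 2.7e-3           # reported learning rate
    batch_size: int = 4
    gpu: str = "RTX 3090"        # single GPU
    num_runs: int = 3            # main results averaged over three runs

cfg = DsRAConfig()
print(cfg.backbone, cfg.lr, cfg.batch_size)
```

A frozen dataclass like this makes the reported hyperparameters explicit and immutable, which is convenient when re-running the few-shot, base-to-new, and domain-generalization settings with identical settings.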