BMIP: Bi-directional Modality Interaction Prompt Learning for VLM
Authors: Song-Lin Lv, Yu-Yang Chen, Zhi Zhou, Ming Yang, Lan-Zhe Guo
IJCAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The paper is experimental: "Comprehensive experiments on various datasets reveal that BMIP not only outperforms current state-of-the-art methods across all three evaluation paradigms but is also flexible enough to be combined with other prompt-based methods for consistent performance enhancement." The authors report results on 15 benchmarks showing that BMIP achieves state-of-the-art (SOTA) performance across all tasks, particularly in the open-world generalization evaluation paradigm, and consistently enhances other prompt learning methods when combined with them. |
| Researcher Affiliation | Academia | ¹School of Intelligence Science and Technology, Nanjing University, China; ²School of Artificial Intelligence, Nanjing University, China; ³National Key Laboratory for Novel Software Technology, Nanjing University, China |
| Pseudocode | No | The paper describes the proposed BMIP method and its components (Deep Language Prompt Learning, Deep Vision Prompt Learning, Vision Language Modality Interaction) conceptually with mathematical formulations, but it does not provide any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper states "The implementation details are in Appendix A.", but Appendix A is not provided, and there is no statement of code release or link to an open-source repository for the described method. |
| Open Datasets | Yes | "We assessed 11 different image classification datasets in the open-world generalization and cross-dataset transfer tasks, encompassing ImageNet [Deng et al., 2009], Caltech101 [Fei-Fei et al., 2004], OxfordPets [Parkhi et al., 2012], StanfordCars [Krause et al., 2013], Flowers102 [Nilsback and Zisserman, 2008], Food101 [Bossard et al., 2014], FGVC-Aircraft [Maji et al., 2013], SUN397 [Xiao et al., 2010], UCF101 [Soomro et al., 2012], DTD [Cimpoi et al., 2014], and EuroSAT [Helber et al., 2019]. For the domain generalization task, ImageNet serves as our source dataset, while 4 variants ImageNet-V2 [Recht et al., 2019], ImageNet-Sketch [Wang et al., 2019], ImageNet-A [Hendrycks et al., 2021b], and ImageNet-R [Hendrycks et al., 2021a] serve as the target datasets." |
| Dataset Splits | No | The paper describes evaluation paradigms such as "generalization from base to novel classes" and "open-world generalization," which involve base and new classes, but it does not specify concrete split percentages (e.g., 80/10/10), absolute sample counts per split, or references to predefined splits that would allow exact reproduction. |
| Hardware Specification | No | The paper does not provide any specific details regarding the hardware (e.g., GPU models, CPU types, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions "The implementation details are in Appendix A.", but Appendix A is not provided, and there are no specific software names with version numbers listed in the main text. |
| Experiment Setup | No | The paper states "The implementation details are in Appendix A.", but Appendix A is not provided, and the main text contains no experimental setup details such as hyperparameter values (learning rate, batch size, number of epochs), optimizer settings, or other training configurations. |