Retrieval-Augmented Dynamic Prompt Tuning for Incomplete Multimodal Learning
Authors: Jian Lang, Zhangtao Cheng, Ting Zhong, Fan Zhou
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments conducted on three real-world datasets show that RAGPT consistently outperforms all competitive baselines in handling incomplete modality problems. |
| Researcher Affiliation | Academia | 1University of Electronic Science and Technology of China, Chengdu, Sichuan, China; 2Kash Institute of Electronics and Information Industry, Kashgar, Xinjiang, China |
| Pseudocode | No | The paper describes methods in prose and mathematical formulations, but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code of our work and prompt-based baselines is available at https://github.com/Jian-Lang/RAGPT. |
| Open Datasets | Yes | (1) MM-IMDb (Arevalo et al. 2017), primarily used for movie genre classification involving both image and text modalities. (2) Food101 (Wang et al. 2015), which focuses on image classification that incorporates both image and text. (3) Hate Memes (Kiela et al. 2020), aimed at identifying hate speech in memes using image and text modalities. |
| Dataset Splits | Yes | Detailed statistics of datasets are presented in Table 2; the dataset splits are consistent with the original paper. Table 2 (statistics of three multimodal downstream datasets): MM-IMDb — 25,959 images, 25,959 texts, 15,552 train / 2,608 val / 7,799 test; Hate Memes — 10,000 images, 10,000 texts, 8,500 train / 500 val / 1,500 test; Food101 — 90,688 images, 90,688 texts, 67,972 train / 22,716 test. |
| Hardware Specification | Yes | All experiments are conducted with an NVIDIA RTX 3090 GPU. |
| Software Dependencies | No | The paper mentions using pre-trained ViLT and the AdamW optimizer but does not specify version numbers for any software libraries or programming languages used. |
| Experiment Setup | Yes | The length l of context-aware prompts is set to 2, the number of retrieved instances K is chosen from {1, 3, 5, 7, 9}, and the prompt insertion layer b is set to 2. We utilize the AdamW optimizer (Loshchilov and Hutter 2017) with a learning rate of 1 × 10⁻³ for a total of 20 epochs to optimize the parameters. |
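The hyperparameters reported in the Experiment Setup row can be collected into a configuration sketch. This is a minimal illustration, not taken from the authors' released code; all names (`PROMPT_LENGTH`, `RETRIEVAL_K_CANDIDATES`, etc.) are assumptions for readability.

```python
# Hyperparameter sketch for RAGPT, as reported in the paper (names are
# illustrative assumptions, not identifiers from the official repository).
PROMPT_LENGTH = 2                       # length l of context-aware prompts
RETRIEVAL_K_CANDIDATES = [1, 3, 5, 7, 9]  # values searched for K retrieved instances
PROMPT_INSERT_LAYER = 2                 # transformer layer b where prompts are inserted
OPTIMIZER = "AdamW"                     # Loshchilov and Hutter 2017
LEARNING_RATE = 1e-3
NUM_EPOCHS = 20

def training_config(k: int) -> dict:
    """Assemble one run configuration for a chosen retrieval depth K."""
    assert k in RETRIEVAL_K_CANDIDATES, "K must come from the searched grid"
    return {
        "prompt_length": PROMPT_LENGTH,
        "num_retrieved": k,
        "prompt_insert_layer": PROMPT_INSERT_LAYER,
        "optimizer": OPTIMIZER,
        "lr": LEARNING_RATE,
        "epochs": NUM_EPOCHS,
    }
```

A grid search over K would then call `training_config(k)` once per candidate value and train for 20 epochs each time.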