M^3EL: A Multi-task Multi-topic Dataset for Multi-modal Entity Linking

Authors: Fang Wang, Shenglin Yin, Xiaoying Bai, Minghao Hu, Tianwei Yan, Yi Liang

AAAI 2025

Reproducibility checklist — each entry lists the variable, the assessed result, and the supporting LLM response.
Research Type: Experimental. Experimental results show that the existing models perform far below expectations (ACC of 49.4%-75.8%). Analysis shows that small dataset sizes, insufficient modality task coverage, and limited topic diversity result in poor generalization of multi-modal models. Our dataset effectively addresses these issues, and the CLIPND model fine-tuned with M^3EL shows a significant improvement in accuracy, with an average improvement of 9.3% to 25% across various tasks. [...] We conducted an ablation study, and the experimental results are presented in Figure 3.
Researcher Affiliation: Academia. ¹Peking University, ²Advanced Institute of Big Data, Beijing, ³National University of Defense Technology, ⁴Xinjiang University.
Pseudocode: No. The paper describes the data construction pipeline in Figure 1 but does not present it as structured pseudocode or an algorithm block.
Open Source Code: No. Footnote 1 links 'https://github.com/ww-ffff/M3EL' next to 'M^3EL' when describing the data construction. This link pertains to the dataset, not explicitly to open-source code for the methodology, such as the CLIPND model or the augmented training strategy.
Open Datasets: Yes. To address these obstacles, we propose a dataset construction pipeline and publish M^3EL, a large-scale dataset for MEL. M^3EL includes 79,625 instances, covering 9 diverse multi-modal tasks and 5 different topics. [...] Our dataset is publicly available to facilitate future research. [...] ¹https://github.com/ww-ffff/M3EL
Dataset Splits: Yes. The M^3EL dataset has been divided into training, validation, and test sets in the ratio of 8:1:1.
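The reported 8:1:1 split can be sketched as a seeded shuffle-and-slice over the 79,625 instances (a minimal illustration; the paper does not specify its splitting procedure, ordering, or random seed, so `split_dataset` and `seed` here are assumptions):

```python
import random

def split_dataset(instances, ratios=(0.8, 0.1, 0.1), seed=42):
    """Shuffle instances deterministically, then slice into train/val/test."""
    assert abs(sum(ratios) - 1.0) < 1e-9, "ratios must sum to 1"
    items = list(instances)
    random.Random(seed).shuffle(items)  # seeded for reproducibility
    n = len(items)
    n_train = int(n * ratios[0])
    n_val = int(n * ratios[1])
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]  # remainder absorbs rounding
    return train, val, test

train, val, test = split_dataset(range(79625))
print(len(train), len(val), len(test))  # 63700 7962 7963
```

Giving the remainder to the test split keeps every instance assigned to exactly one partition even when the ratios do not divide the dataset size evenly.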
Hardware Specification: Yes. We trained on an A100 GPU with 40GB of memory with 35 epochs and a batch size of 256.
Software Dependencies: No. The paper mentions employing the CLIP (ViT-B-32) architecture and the AdamW optimizer but does not specify software versions for libraries such as PyTorch or TensorFlow.
Experiment Setup: Yes. We trained on an A100 GPU with 40GB of memory with 35 epochs and a batch size of 256. The AdamW optimizer was applied with an initial learning rate of 1e-5, betas=(0.9, 0.98), eps=1e-6, and weight decay=0.001.
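The reported hyperparameters correspond to a standard AdamW configuration. A minimal sketch collecting them in one place (the `train_config` name is an assumption; the commented call shows where each value would plug into PyTorch's `torch.optim.AdamW`, which accepts exactly these keyword arguments):

```python
# Training configuration as reported in the paper (values are from the text;
# the dictionary itself is an illustrative assumption, not the authors' code).
train_config = {
    "epochs": 35,
    "batch_size": 256,
    "optimizer": "AdamW",
    "lr": 1e-5,            # initial learning rate
    "betas": (0.9, 0.98),  # Adam moment coefficients
    "eps": 1e-6,           # numerical stability term
    "weight_decay": 0.001, # decoupled L2 regularization
}

# With PyTorch installed, this maps directly onto the optimizer:
# optimizer = torch.optim.AdamW(
#     model.parameters(),
#     lr=train_config["lr"],
#     betas=train_config["betas"],
#     eps=train_config["eps"],
#     weight_decay=train_config["weight_decay"],
# )
```

Note that betas=(0.9, 0.98) and eps=1e-6 depart from PyTorch's AdamW defaults of (0.9, 0.999) and 1e-8, matching the settings commonly used for CLIP fine-tuning.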