M^3EL: A Multi-task Multi-topic Dataset for Multi-modal Entity Linking
Authors: Fang Wang, Shenglin Yin, Xiaoying Bai, Minghao Hu, Tianwei Yan, Yi Liang
AAAI 2025 | Venue PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results show that existing models perform far below expectations (ACC of 49.4%-75.8%). Analysis indicates that small dataset sizes, insufficient modality task coverage, and limited topic diversity result in poor generalization of multi-modal models. Our dataset effectively addresses these issues, and the CLIPND model fine-tuned with M^3EL shows a significant improvement in accuracy, with an average improvement of 9.3% to 25% across various tasks. [...] We conducted an ablation study, and the experimental results are presented in Figure 3. |
| Researcher Affiliation | Academia | 1Peking University 2Advanced Institute of Big Data, Beijing 3National University of Defense Technology 4Xinjiang University EMAIL, EMAIL, EMAIL, EMAIL, EMAIL |
| Pseudocode | No | The paper describes the data construction pipeline in Figure 1 but does not present it as structured pseudocode or an algorithm block. |
| Open Source Code | No | Footnote 1 links 'https://github.com/ww-ffff/M3EL' next to 'M^3EL' when describing the data construction. This link pertains to the dataset, not explicitly to open-source code for the methodology, such as the CLIPND model or the augmented training strategy. |
| Open Datasets | Yes | To address these obstacles, we propose a dataset construction pipeline and publish M^3EL, a large-scale dataset for MEL. M^3EL includes 79,625 instances, covering 9 diverse multi-modal tasks, and 5 different topics. [...] Our dataset is publicly available to facilitate future research. [...] 1https://github.com/ww-ffff/M3EL |
| Dataset Splits | Yes | The M^3EL dataset has been divided into training, validation, and test sets in the ratio of 8:1:1. |
| Hardware Specification | Yes | We trained on an A100 GPU with 40GB of memory with 35 epochs and a batch size of 256. |
| Software Dependencies | No | The paper mentions employing the CLIP (ViT-B-32) architecture and the AdamW optimizer but does not specify software versions for libraries like PyTorch or TensorFlow. |
| Experiment Setup | Yes | We trained on an A100 GPU with 40GB of memory with 35 epochs and a batch size of 256. The AdamW optimizer was applied with an initial learning rate of 1e-5, betas=(0.9, 0.98), eps=1e-6, and weight decay=0.001. |
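The reported 8:1:1 train/validation/test ratio over 79,625 instances does not divide evenly, so reproducers must pick a rounding scheme. The sketch below is one conventional choice, not the authors' published procedure: shuffle with a fixed seed, floor the train and validation counts, and assign the remainder to the test set. The function name `split_811` and the seed are illustrative assumptions.

```python
import random

def split_811(n_instances: int, seed: int = 0):
    """Return (train, val, test) index lists in an approximate 8:1:1 ratio.

    Assumes floor rounding for train/val, with the remainder going to test;
    the paper does not specify its exact rounding or shuffling scheme.
    """
    indices = list(range(n_instances))
    random.Random(seed).shuffle(indices)  # deterministic shuffle for reproducibility
    n_train = int(n_instances * 0.8)
    n_val = int(n_instances * 0.1)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # remainder absorbs the rounding slack
    return train, val, test

train, val, test = split_811(79_625)
print(len(train), len(val), len(test))  # 63700 7962 7963
```

Under this scheme the test split ends up one instance larger than validation; any consistent alternative (e.g., giving the remainder to train) would also satisfy the stated 8:1:1 ratio.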