Tokenization, Fusion, and Augmentation: Towards Fine-grained Multi-modal Entity Representation

Authors: Yichi Zhang, Zhuo Chen, Lingbing Guo, Yajing Xu, Binbin Hu, Ziqi Liu, Wen Zhang, Huajun Chen

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experiments on standard MMKGC benchmarks reveal that our method surpasses 19 of the latest models, underlining its superior performance." "We conduct comprehensive experiments with public MMKG benchmarks (Liu et al. 2019a)."
Researcher Affiliation | Collaboration | 1. College of Computer Science and Technology, Zhejiang University; 2. ZJU-Ant Group Joint Lab of Knowledge Graph; 3. Ant Group; 4. School of Software Technology, Zhejiang University; 5. Zhejiang Key Laboratory of Big Data Intelligent Computing
Pseudocode | No | The paper describes its methods through textual descriptions and diagrams (Figure 2 provides an overview of the framework) but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | No | The paper does not contain an explicit statement about releasing source code or a link to a code repository for the methodology described.
Open Datasets | Yes | "In this paper, we employ three public MMKGC benchmarks DB15K (Liu et al. 2019a), MKG-W and MKG-Y (Xu et al. 2022) to evaluate the model performance."
Dataset Splits | No | The paper mentions using the benchmark datasets but does not specify train/validation/test split sizes or ratios.
Hardware Specification | No | The paper does not specify the hardware (e.g., GPU model or memory) used to run the experiments.
Software Dependencies | No | The paper does not list the software libraries, frameworks, or version information needed to reproduce the experiments.
Experiment Setup | Yes | "During training, we set the training epoch to 2000, the batch size to 1024, and the embedding dimension to 256. The max token number m and n are tuned in {4, 8, 12} and the weight λ is tuned in {1, 0.1, 0.01, 0.001}. We optimize the model with Adam (Kingma and Ba 2015) optimizer."
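The reported setup can be captured as a small configuration sketch. This is an illustrative reconstruction, not code from the paper: the names `base_config`, `search_space`, and `grid` are invented here, and the exhaustive grid enumeration is an assumption (the paper only says m, n, and λ are "tuned in" those value sets, not that a full grid search was performed).

```python
from itertools import product

# Fixed hyperparameters reported in the paper's experiment setup.
base_config = {
    "epochs": 2000,
    "batch_size": 1024,
    "embedding_dim": 256,
    "optimizer": "Adam",  # Kingma and Ba 2015
}

# Tuned values reported in the paper: max token numbers m, n
# and the loss weight lambda.
search_space = {
    "m": [4, 8, 12],
    "n": [4, 8, 12],
    "lambda": [1, 0.1, 0.01, 0.001],
}

def grid(space):
    """Yield one dict per point in the hyperparameter grid."""
    keys = list(space)
    for values in product(*(space[k] for k in keys)):
        yield dict(zip(keys, values))

# Merge each grid point with the fixed settings: 3 * 3 * 4 = 36 runs.
configs = [dict(base_config, **point) for point in grid(search_space)]
```

If m and n were tuned jointly rather than independently, the search space would be smaller; the paper does not say which.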