MMKE-Bench: A Multimodal Editing Benchmark for Diverse Visual Knowledge

Authors: Yuntao Du, Kailin Jiang, Zhi Gao, Chenrui Shi, Zilong Zheng, Siyuan Qi, Qing Li

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We assess five state-of-the-art knowledge editing methods on three prominent LMMs, revealing that no method excels across all criteria, and that visual and user-specific edits are particularly challenging. MMKE-Bench sets a new standard for evaluating the robustness of multimodal knowledge editing techniques, driving progress in this rapidly evolving field. Also, 'Extensive experiments with various baseline methods and LMMs in both single and sequential editing settings are conducted, revealing several limitations in existing knowledge editing approaches.'
Researcher Affiliation | Academia | 1) State Key Laboratory of General Artificial Intelligence, BIGAI; 2) School of Software & Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University; 3) University of Science and Technology of China; 4) State Key Laboratory of General Artificial Intelligence, Peking University; 5) Beijing Key Laboratory of Intelligent Information Technology, School of Computer Science & Technology, Beijing Institute of Technology
Pseudocode | No | The paper describes the methods and processes using natural language descriptions and diagrams like Figure 2 for the construction pipeline, but does not present any formal pseudocode or algorithm blocks.
Open Source Code | No | The paper mentions using the 'VLKEB library' (https://github.com/VLKEB/VLKEB) for conducting experiments, but does not explicitly state that the source code for the MMKE-Bench construction methodology described in this paper is being released, nor does it provide a link to such code.
Open Datasets | No | The paper introduces MMKE-Bench, a comprehensive multimodal knowledge editing benchmark consisting of 2,940 pieces of knowledge and 8,363 images. While the paper describes the creation and statistics of this new benchmark, it does not provide a direct URL, DOI, or specific repository name for accessing the dataset.
Dataset Splits | Yes | The statistics of MMKE-Bench are shown in Tab. 2. MMKE-Bench encompasses three classes of edited knowledge, totaling 2,940 knowledge pieces and 8,363 images. The knowledge spans 175 fine-grained types, highlighting the diversity of MMKE-Bench. We split the dataset into training and validation sets at a 4:6 ratio, with the training set reserved solely for specific knowledge editing methods (e.g., SERAC, Mitchell et al. (2022b)).
Hardware Specification | Yes | The experiments are performed on NVIDIA A100/A800 80GB GPUs.
Software Dependencies | No | The paper mentions using 'PyTorch' and the 'VLKEB library' for experiments, but does not provide specific version numbers for these or any other software dependencies.
Experiment Setup | Yes | Appendix B includes Tables 10, 11, and 12, which provide detailed hyper-parameters for knowledge editing methods and LMMs on visual entity editing, visual semantic editing, and user-specific editing. These tables specify settings such as Steps, Edit Layer, Optimizer, and Edit LR for models like BLIP2-OPT, MiniGPT-4, and LLaVA-1.5, and methods like FT-LLM, FT-Alignment, KE, SERAC, and MEND.