Multimodal Image Matching Based on Cross-Modality Completion Pre-training
Authors: Meng Yang, Fan Fan, Jun Huang, Yong Ma, Xiaoguang Mei, Zhanchuan Cai, Jiayi Ma
IJCAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments demonstrate that XCP-Match outperforms existing algorithms on public datasets. Section 4 is dedicated to "Experiments" covering various evaluations and an ablation study. |
| Researcher Affiliation | Academia | 1Wuhan University, 2Macau University of Science and Technology, EMAIL, EMAIL. All affiliations are universities, and the email domains are academic (.edu.cn, .edu.mo). |
| Pseudocode | No | The paper describes the methodology in narrative text and uses figures to illustrate the architecture (e.g., Figure 1 and Figure 2). It does not contain any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper contains no statement about releasing source code and no link to a code repository, so no clear, affirmative statement of release can be identified. |
| Open Datasets | Yes | In pre-training, we use the KAIST Multispectral Pedestrian dataset [Hwang et al., 2015] for training. In fine-tuning, we use the MegaDepth dataset [Li and Snavely, 2018]. To evaluate the performance of XCP-Match for relative pose estimation in visible-infrared image pairs, we test it on the METU-VisTIR dataset [Tuzcuoğlu et al., 2024]. To test XCP-Match for homography estimation, we use the RoadScene dataset [Xu et al., 2020b; Xu et al., 2020a]. To test XCP-Match's performance for image registration, we evaluate it on the Tri-Modal Human dataset [Palmero et al., 2016]. |
| Dataset Splits | No | The paper mentions using datasets for training and testing (e.g., "KAIST Multispectral Pedestrian dataset [...] for training" and "Mega Depth dataset [...] for fine-tuning"), and it mentions selecting specific images for testing in one case ("We use RGB-FIR images and select only those with distinct human segmentations for testing"). However, it does not provide specific information on how the datasets are split into training, validation, and test sets (e.g., percentages, sample counts, or references to predefined splits with detailed methodology) to reproduce the data partitioning. |
| Hardware Specification | Yes | Pre-training is conducted using the AdamW optimizer with a learning rate of 2.5 × 10⁻⁴, a batch size of 2, a total of 15 epochs, and 30 hours of training on 2 NVIDIA GeForce RTX 4090 GPUs. Fine-tuning is conducted [...] on 2 NVIDIA GeForce RTX 4090 GPUs. |
| Software Dependencies | No | The paper mentions using the "AdamW optimizer" but does not specify any programming languages, libraries, or frameworks with version numbers that would be necessary to replicate the experiment. |
| Experiment Setup | Yes | Pre-training is conducted using the AdamW optimizer with a learning rate of 2.5 × 10⁻⁴, a batch size of 2, a total of 15 epochs, and 30 hours of training on 2 NVIDIA GeForce RTX 4090 GPUs. Fine-tuning is conducted [...] with a learning rate of 2.5 × 10⁻⁴, a batch size of 2, a total of 25 epochs, and 125 hours of training on 2 NVIDIA GeForce RTX 4090 GPUs. The thresholds in the matching network are set to: θc = 0.3, θf = 0.1. The settings in the loss function are set to: λc = 0.5, λf = 0.3, λsub = 10⁴, λs = 1, λvis_ac = λir_ac = 0.25. |
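The quoted hyperparameters can be collected into a single configuration, which would be a natural starting point for a replication attempt. This is a minimal sketch: the numeric values are taken from the paper's quoted setup, but the dictionary structure and key names are illustrative assumptions, not part of the paper.

```python
# Hedged sketch of the reported XCP-Match training configuration.
# Values come from the paper's experiment-setup text; the layout and
# key names here are assumptions for illustration only.

pretrain_cfg = {
    "optimizer": "AdamW",
    "learning_rate": 2.5e-4,
    "batch_size": 2,
    "epochs": 15,          # reported pre-training duration: ~30 hours
    "hardware": "2x NVIDIA GeForce RTX 4090",
}

# Fine-tuning reuses the same optimizer settings with more epochs
# (reported duration: ~125 hours on the same hardware).
finetune_cfg = {**pretrain_cfg, "epochs": 25}

# Matching-network thresholds (coarse and fine).
matching_thresholds = {"theta_c": 0.3, "theta_f": 0.1}

# Loss-function weights as reported.
loss_weights = {
    "lambda_c": 0.5,
    "lambda_f": 0.3,
    "lambda_sub": 1e4,
    "lambda_s": 1.0,
    "lambda_vis_ac": 0.25,
    "lambda_ir_ac": 0.25,
}
```

Keeping the shared optimizer settings in one dict and overriding only the epoch count mirrors the fact that pre-training and fine-tuning differ only in schedule length in the reported setup.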