Multi-Modal Object Re-identification via Sparse Mixture-of-Experts

Authors: Yingying Feng, Jie Li, Chi Xie, Lei Tan, Jiayi Ji

ICML 2025

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Extensive experiments on three challenging public datasets (RGBNT201, RGBNT100, and MSVR310) demonstrate the superiority of our approach in terms of both accuracy and efficiency, with 8.4% mAP and 6.9% accuracy improved in RGBNT201 with negligible additional parameters. |
| Researcher Affiliation | Academia | 1School of Computer Science and Engineering, Northeastern University, Shenyang, China. 2School of Informatics, Xiamen University, Xiamen, China. 3Tongji University, Shanghai, China. 4National University of Singapore, Singapore. Correspondence to: Lei Tan <EMAIL>. |
| Pseudocode | No | The paper describes the methodology using mathematical formulas and textual explanations of modules (FFM, FRM) but does not present a structured pseudocode block or algorithm. |
| Open Source Code | Yes | The code is available at https://github.com/stone96123/MFRNet. |
| Open Datasets | Yes | We evaluate our model performance on three public multi-modal object ReID datasets. Specifically, RGBNT201 (Zheng et al., 2021) is a multi-modal person ReID dataset, which includes 4,787 aligned RGB, NIR, and TIR images from 201 identities. RGBNT100 (Li et al., 2020) is a large-scale multi-modal vehicle ReID dataset comprising 17,250 image triples. MSVR310 (Zheng et al., 2022) is a small-scale multi-modal vehicle ReID dataset that includes 2,087 high-quality image triples captured across diverse environments and time spans. |
| Dataset Splits | No | The paper mentions data augmentation techniques and mini-batch sizes, but it does not explicitly state the dataset splits (e.g., train/test/validation percentages or counts) used for the experiments, nor does it reference standard splits for the mentioned datasets. |
| Hardware Specification | Yes | Our model is implemented using the PyTorch toolbox, and experiments are conducted on an NVIDIA V100 GPU. |
| Software Dependencies | No | The paper mentions using the 'PyTorch toolbox' but does not specify a version number for PyTorch or any other software dependencies. |
| Experiment Setup | Yes | The mini-batch size is set to 128 for RGBNT100, and 64 for RGBNT201 and MSVR310, with corresponding sampling strategies for each dataset. We employ the Adam optimizer with an initial learning rate of 3.5e-4, and the learning rate of the visual encoder is 5e-6. The total number of training epochs is set to 45 for RGBNT201 and RGBNT100, and 50 for MSVR310. |
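The optimizer configuration reported in the Experiment Setup row (Adam, 3.5e-4 base learning rate, 5e-6 for the visual encoder) can be sketched in PyTorch as two parameter groups. This is a minimal illustration, not the authors' code: `DummyReIDModel` and its submodule names are placeholders standing in for the real MFRNet architecture.

```python
import torch
from torch import nn


class DummyReIDModel(nn.Module):
    """Placeholder model; the real MFRNet architecture is not reproduced here."""

    def __init__(self):
        super().__init__()
        self.visual_encoder = nn.Linear(8, 8)  # stands in for the visual encoder
        self.head = nn.Linear(8, 4)            # stands in for the remaining modules


model = DummyReIDModel()

# Two parameter groups, mirroring the reported setup: the visual encoder is
# trained at 5e-6 while all other parameters use the base rate of 3.5e-4.
optimizer = torch.optim.Adam(
    [
        {"params": model.visual_encoder.parameters(), "lr": 5e-6},
        {"params": model.head.parameters()},  # falls back to the base lr below
    ],
    lr=3.5e-4,
)

print([group["lr"] for group in optimizer.param_groups])  # [5e-06, 0.00035]
```

Per-parameter-group learning rates are a standard PyTorch mechanism; a group without its own `"lr"` key inherits the optimizer-level default, which is how the 5e-6 encoder rate coexists with the 3.5e-4 base rate.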