reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

DeMo: Decoupled Feature-Based Mixture of Experts for Multi-Modal Object Re-Identification

Authors: Yuhao Wang, Yang Liu, Aihua Zheng, Pingping Zhang

AAAI 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Extensive experiments on three object ReID benchmarks verify the effectiveness of our methods.
Researcher Affiliation	Academia	(1) School of Future Technology, School of Artificial Intelligence, Dalian University of Technology; (2) School of Artificial Intelligence, Anhui University; (3) Anhui Provincial Key Laboratory of Multimodal Cognitive Computation, Anhui University. EMAIL, EMAIL, EMAIL
Pseudocode	No	The paper describes the methodology in prose and uses diagrams (e.g., Figure 2) to illustrate the framework, but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code	Yes	The detailed configurations and results are available at https://github.com/924973292/DeMo.
Open Datasets	Yes	We evaluate the proposed method on three multi-modal object ReID benchmarks. To be specific, RGBNT201 (Zheng et al. 2021) is a multi-modal person ReID dataset, consisting of 4,787 aligned RGB, NIR and TIR images from 201 identities. RGBNT100 (Li et al. 2020) is a large-scale multi-modal vehicle ReID dataset with 17,250 image triples, covering a wide range of challenging visual conditions. MSVR310 (Zheng et al. 2022) is a small-scale multi-modal vehicle ReID dataset with 2,087 image triples, featuring high-quality images captured across diverse environments and time spans.
Dataset Splits	Yes	We evaluate the proposed method on three multi-modal object ReID benchmarks. To be specific, RGBNT201 (Zheng et al. 2021) is a multi-modal person ReID dataset, consisting of 4,787 aligned RGB, NIR and TIR images from 201 identities. RGBNT100 (Li et al. 2020) is a large-scale multi-modal vehicle ReID dataset with 17,250 image triples, covering a wide range of challenging visual conditions. MSVR310 (Zheng et al. 2022) is a small-scale multi-modal vehicle ReID dataset with 2,087 image triples, featuring high-quality images captured across diverse environments and time spans.
Hardware Specification	Yes	Our model is implemented using PyTorch with an NVIDIA A100 GPU.
Software Dependencies	No	Our model is implemented using PyTorch with an NVIDIA A100 GPU.
Experiment Setup	Yes	Images in triples are resized to 256×128 for RGBNT201 and 128×256 for RGBNT100/MSVR310. For data augmentation, we apply random horizontal flipping, cropping and erasing (Zhong et al. 2020). For RGBNT201 and MSVR310, the mini-batch size is set to 64, sampling 8 images per identity. For RGBNT100, the mini-batch size is 128 with 16 images per identity. We fine-tune the proposed modules using the Adam optimizer with a learning rate of 3.5e-4 and a smaller learning rate of 5e-6 for the visual encoder. The total number of training epochs is 50. The number of experts nd is set to 7.
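
The Experiment Setup row can be summarized as a small configuration sketch. This is not taken from the authors' repository: the names `TrainConfig`, `CONFIGS`, and `param_groups` are hypothetical, and reading the image sizes as (height, width) is an assumption. Only the numeric values come from the quoted text. The helper returns per-group learning rates in the list-of-dicts form that PyTorch optimizers such as `torch.optim.Adam` accept, but the block itself uses only the standard library.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class TrainConfig:
    """Hypothetical container for the hyperparameters quoted above."""
    dataset: str
    image_size: Tuple[int, int]   # assumed to be (height, width)
    batch_size: int
    images_per_identity: int
    lr_modules: float = 3.5e-4    # learning rate for the proposed modules
    lr_encoder: float = 5e-6      # smaller learning rate for the visual encoder
    epochs: int = 50
    num_experts: int = 7

    @property
    def identities_per_batch(self) -> int:
        # With identity-balanced sampling, each batch holds
        # batch_size / images_per_identity distinct identities.
        if self.batch_size % self.images_per_identity != 0:
            raise ValueError("batch_size must be divisible by images_per_identity")
        return self.batch_size // self.images_per_identity


# Values as quoted in the Experiment Setup row.
CONFIGS: Dict[str, TrainConfig] = {
    "RGBNT201": TrainConfig("RGBNT201", (256, 128), 64, 8),
    "RGBNT100": TrainConfig("RGBNT100", (128, 256), 128, 16),
    "MSVR310": TrainConfig("MSVR310", (128, 256), 64, 8),
}


def param_groups(cfg: TrainConfig,
                 encoder_params: Iterable[Any],
                 module_params: Iterable[Any]) -> List[Dict[str, Any]]:
    """Build two parameter groups with separate learning rates."""
    return [
        {"params": list(encoder_params), "lr": cfg.lr_encoder},
        {"params": list(module_params), "lr": cfg.lr_modules},
    ]


if __name__ == "__main__":
    for name, cfg in CONFIGS.items():
        print(name, cfg.image_size, cfg.identities_per_batch)
```

Under these values, both batch settings work out to 8 identities per mini-batch (64 / 8 and 128 / 16). With PyTorch installed, the list returned by `param_groups` could be passed directly as the first argument to `torch.optim.Adam`.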