MambaPro: Multi-Modal Object Re-identification with Mamba Aggregation and Synergistic Prompt
Authors: Yuhao Wang, Xuehu Liu, Tianyu Yan, Yang Liu, Aihua Zheng, Pingping Zhang, Huchuan Lu
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on three multi-modal object Re-ID benchmarks (i.e., RGBNT201, RGBNT100 and MSVR310) validate the effectiveness of our proposed methods. |
| Researcher Affiliation | Academia | (1) School of Future Technology, School of Artificial Intelligence, Dalian University of Technology; (2) School of Computer Science and Artificial Intelligence, Wuhan University of Technology; (3) School of Artificial Intelligence, Anhui University; (4) Anhui Provincial Key Laboratory of Multimodal Cognitive Computation, Anhui University. EMAIL, EMAIL, EMAIL, EMAIL |
| Pseudocode | No | The paper describes the methodology using mathematical equations and textual explanations, but it does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks. |
| Open Source Code | Yes | The source code is available at https://github.com/924973292/MambaPro. |
| Open Datasets | Yes | To fully evaluate the performance of our method, we conduct experiments on three multi-modal object Re-ID benchmarks. Specifically, RGBNT201 (Zheng et al. 2021) is a multi-modal person Re-ID dataset comprising RGB, NIR and TIR images. RGBNT100 (Li et al. 2020) is a large-scale multi-modal vehicle Re-ID dataset with diverse visual challenges, such as abnormal lighting, glaring and occlusion. MSVR310 (Zheng et al. 2022) is a small-scale multi-modal vehicle Re-ID dataset with more challenges. |
| Dataset Splits | Yes | To fully evaluate the performance of our method, we conduct experiments on three multi-modal object Re-ID benchmarks. Specifically, RGBNT201 (Zheng et al. 2021)... RGBNT100 (Li et al. 2020)... MSVR310 (Zheng et al. 2022)... For small-scale datasets (i.e., RGBNT201 and MSVR310), the mini-batch size is set to 64, with 4 images sampled for each identity and 16 identities sampled in a batch. For the large-scale dataset, i.e., RGBNT100, the mini-batch size is set to 128, with 16 images sampled for each identity. |
| Hardware Specification | Yes | Our model is implemented by using the PyTorch toolbox with one NVIDIA A100 GPU. |
| Software Dependencies | No | Our model is implemented by using the PyTorch toolbox with one NVIDIA A100 GPU. The paper mentions PyTorch but does not provide a specific version number, nor does it list other software dependencies with version numbers. |
| Experiment Setup | Yes | Implementation Details. Our model is implemented by using the PyTorch toolbox with one NVIDIA A100 GPU. We employ the pre-trained image encoder of CLIP (Radford et al. 2021) as the backbone. For the input resolution, images are resized to 256×128 for RGBNT201 and 128×256 for RGBNT100/MSVR310. For data augmentation, we employ random horizontal flipping, cropping and erasing (Zhong et al. 2020). For small-scale datasets (i.e., RGBNT201 and MSVR310), the mini-batch size is set to 64, with 4 images sampled for each identity and 16 identities sampled in a batch. For the large-scale dataset, i.e., RGBNT100, the mini-batch size is set to 128, with 16 images sampled for each identity. We set λ1 to 0.25 and λ2 to 1.0. We use the Adam optimizer to fine-tune the model with a learning rate of 3.5e-4. The warmup strategy with a cosine decay is used for learning rate scheduling. We set the total number of training epochs to 60 for RGBNT201/MSVR310 and 30 for RGBNT100, respectively. |
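The learning-rate schedule quoted in the Experiment Setup row (Adam with a base rate of 3.5e-4, linear warmup followed by cosine decay, 60 or 30 total epochs) can be sketched as a standalone function. This is a minimal illustration, not the authors' code: the warmup length (`warmup_epochs`) and the floor learning rate (`min_lr`) are assumptions, since the paper does not state them.

```python
import math

def lr_at_epoch(epoch, total_epochs=60, base_lr=3.5e-4,
                warmup_epochs=5, min_lr=1e-6):
    """Warmup + cosine-decay schedule; warmup_epochs and min_lr are assumed."""
    if epoch < warmup_epochs:
        # Linear warmup: ramp from 0 up to base_lr over the first epochs.
        return base_lr * (epoch + 1) / warmup_epochs
    # Cosine decay: anneal from base_lr down to min_lr over remaining epochs.
    progress = (epoch - warmup_epochs) / max(1, total_epochs - warmup_epochs)
    return min_lr + 0.5 * (base_lr - min_lr) * (1 + math.cos(math.pi * progress))
```

In an actual PyTorch training loop, the same shape is usually obtained by composing a warmup scheduler with `torch.optim.lr_scheduler.CosineAnnealingLR` on top of `torch.optim.Adam`.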