Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty, so scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
FlexiReID: Adaptive Mixture of Expert for Multi-Modal Person Re-Identification
Authors: Zhen Sun, Lei Tan, Yunhang Shen, Chengmao Cai, Xing Sun, Pingyang Dai, Liujuan Cao, Rongrong Ji
ICML 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments demonstrate that FlexiReID achieves state-of-the-art performance and offers strong generalization in complex scenarios. To facilitate comprehensive evaluation, we construct CIRS-PEDES, a unified dataset extending four popular Re-ID datasets to include all four modalities. Experimental results on CIRS-PEDES show that the proposed FlexiReID outperforms several other state-of-the-art methods, demonstrating its flexibility and effectiveness in complex scenes. |
| Researcher Affiliation | Collaboration | 1Key Laboratory of Multimedia Trusted Perception and Efficient Computing, Ministry of Education of China, Xiamen University, 361005, P.R. China. 2National University of Singapore. 3Tencent Youtu Lab. 4Institute of Artificial Intelligence, Xiamen University, Xiamen, China. Correspondence to: Pingyang Dai <EMAIL>. |
| Pseudocode | No | The paper describes the methodology using mathematical equations and textual explanations, but does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any explicit statement about releasing the source code or a link to a code repository for the described methodology. |
| Open Datasets | Yes | To facilitate comprehensive evaluation, we construct CIRS-PEDES, a unified dataset extending four popular Re-ID datasets to include all four modalities. We introduced four datasets: CUHK-PEDES, ICFG-PEDES, RSTPReid, and SYSU-MM01. An overview of training and test set partitioning for each dataset can be found in the existing work (Ding et al., 2021; Zhu et al., 2021; Wu et al., 2020). |
| Dataset Splits | Yes | An overview of training and test set partitioning for each dataset can be found in the existing work (Ding et al., 2021; Zhu et al., 2021; Wu et al., 2020). Following existing cross-modality Re-ID settings (Chen et al., 2022a; Ye et al., 2021b;c), we use the Rank-k matching accuracy, mean Average Precision (mAP), and mean Inverse Negative Penalty (mINP) (Ye et al., 2021c) metrics for performance evaluation in our FlexiReID. |
| Hardware Specification | Yes | We perform experiments on a single NVIDIA 3090 24GB GPU. |
| Software Dependencies | No | We employ the Vision Transformer (Dosovitskiy, 2020) as the visual modalities feature learning backbone, and the Transformer model (Vaswani, 2017) as the textual modality feature learning backbone. Both backbones have pre-trained parameters derived from CLIP (Radford et al., 2021). During the training process, all parameters of the backbone networks are frozen. ... We train our FlexiReID model with the Adam optimizer for 60 epochs. |
| Experiment Setup | Yes | In a batch, we randomly select 64 identities, each containing a sketch, an infrared, a text, and an RGB sample. The image is resized to 384 x 128, and the length of the textual token sequence is 77. We train our FlexiReID model with the Adam optimizer for 60 epochs. The initial learning rate is set to 1e-5 and decayed by a cosine schedule. The threshold confidence level is set to 0.6, and the number of experts is 6. The hyperparameter λ that controls the adaptive loss is set to 0.5. |
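The reported experiment setup can be collected into a minimal configuration sketch. This is an illustrative reconstruction, not the authors' released code (the paper states no code release): the `CONFIG` names and the exact cosine-decay formula are assumptions, since the paper gives only the initial rate and the schedule type.

```python
import math

# Hyperparameters as reported in the paper's experiment setup.
# (Key names are hypothetical; values come from the quoted text.)
CONFIG = {
    "identities_per_batch": 64,   # one sketch, infrared, text, and RGB sample each
    "image_size": (384, 128),
    "text_token_length": 77,
    "epochs": 60,
    "initial_lr": 1e-5,
    "confidence_threshold": 0.6,
    "num_experts": 6,
    "adaptive_loss_lambda": 0.5,
}

def cosine_lr(epoch: int, total_epochs: int = 60, base_lr: float = 1e-5) -> float:
    """Cosine-decayed learning rate at a given epoch.

    The paper states cosine decay from 1e-5 over 60 epochs but not the
    exact formula; this is the standard annealing form (decaying to ~0).
    """
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * epoch / total_epochs))

# The schedule starts at the initial rate and decays toward zero.
print(cosine_lr(0))   # 1e-05
print(cosine_lr(60))  # ~0.0
```

A common alternative is to anneal to a small floor rate rather than zero; without the released code, either reading is consistent with the quoted description.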