Template-based Uncertainty Multimodal Fusion Network for RGBT Tracking

Authors: Zhaodong Ding, Chenglong Li, Shengqing Miao, Jin Tang

IJCAI 2025

Reproducibility Variable | Result | LLM Response
Research Type: Experimental — "Extensive experiments suggest that our method outperforms existing approaches on four RGBT tracking benchmarks." (Abstract) "Extensive experiments demonstrate that our method outperforms existing RGBT tracking methods on four popular RGBT tracking datasets, including GTOT [Li et al., 2016], RGBT210 [Li et al., 2017], RGBT234 [Li et al., 2019a] and LasHeR [Li et al., 2021]." (Section 1 - Contributions) See also Section 4 (Experiment) and Section 4.4 (Ablation Study).
Researcher Affiliation: Academia — 1 National Key Laboratory of Opto-Electronic Information Acquisition and Protection Technology, Hefei, 230601, China; 2 Anhui Provincial Key Laboratory of Security Artificial Intelligence, Hefei, 230601, China; 3 Anhui Provincial Key Laboratory of Multimodal Cognitive Computation, Hefei, 230601, China; 4 School of Artificial Intelligence, Anhui University, Hefei, 230601, China; 5 School of Computer Science and Technology, Anhui University, Hefei, 230601, China; 6 School of Electronic and Information Engineering, Anhui University, Hefei, 230601, China. zhaodongding EMAIL, EMAIL, EMAIL, EMAIL
Pseudocode: No — The paper describes its methodology using text and mathematical equations, but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code: Yes — https://github.com/dongdong2061/IJCAI25-TUMFNet (Conclusion section)
Open Datasets: Yes — "Extensive experiments demonstrate that our method outperforms existing RGBT tracking methods on four popular RGBT tracking datasets, including GTOT [Li et al., 2016], RGBT210 [Li et al., 2017], RGBT234 [Li et al., 2019a] and LasHeR [Li et al., 2021]." (Section 1 - Contributions) "GTOT [Li et al., 2016] dataset is the earliest publicly available RGBT tracking dataset, containing a total of 50 sequences and approximately 15,000 frames. RGBT210 [Li et al., 2017] dataset consists of 210 pairs of RGBT video sequences, totaling 209.4K frames, and includes annotations across 12 attributes. RGBT234 [Li et al., 2019a] dataset extends RGBT210 dataset, offering more precise annotations. It contains a total of 234 pairs of RGBT video sequences, amounting to approximately 233.4K frames. LasHeR [Li et al., 2021] is a large RGBT tracking dataset, consisting of 1,224 aligned video sequences with approximately 1,469.6K frames. It includes 245 test sequences and 979 training sequences, covering 19 real-world challenge attributes." (Section 4.2)
Dataset Splits: Yes — "We train the overall tracking network end-to-end using the LasHeR training set to evaluate GTOT, RGBT210, RGBT234, and LasHeR test set." (Section 4.1) "LasHeR [Li et al., 2021] is a large RGBT tracking dataset, consisting of 1,224 aligned video sequences with approximately 1,469.6K frames. It includes 245 test sequences and 979 training sequences, covering 19 real-world challenge attributes." (Section 4.2)
Hardware Specification: Yes — "Our model is implemented using PyTorch and experiments are conducted on one RTX 4090 GPU." (Section 4.1)
Software Dependencies: No — "Our model is implemented using PyTorch and experiments are conducted on one RTX 4090 GPU. We take OSTrack [Ye et al., 2022] as the base tracker, which employs ViT as the backbone network for feature extraction." (Section 4.1) The paper mentions PyTorch, OSTrack, and ViT, but does not provide specific version numbers for any of these software components.
Experiment Setup: Yes — "The input search region and template sizes of the model are 256×256 and 128×128, respectively. The learning rate for the backbone network is set to 1×10^-5, while the learning rate for the other parameters is set to 1×10^-4. The model is trained for a total of 15 epochs. Additionally, we use the AdamW optimizer with a weight decay of 1×10^-4. Note that our UMFM is inserted into the 10th, 11th, and 12th blocks of the ViT. λ3 is set to 0.01 and N is set to 20." (Section 4.1)
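The training configuration quoted above can be collected into a minimal sketch for anyone attempting a reproduction: a config dict with the reported hyperparameters, plus the per-parameter-group learning-rate rule (smaller LR for the backbone, larger LR for everything else). The parameter-name prefix `backbone.` is an assumption for illustration, not confirmed by the paper.

```python
# Hedged sketch of the reported setup (Section 4.1): search/template sizes,
# 15 epochs, AdamW-style weight decay 1e-4, backbone LR 1e-5, other LR 1e-4.
# The "backbone." name prefix below is hypothetical, chosen for illustration.

TRAIN_CONFIG = {
    "search_size": (256, 256),    # input search region
    "template_size": (128, 128),  # input template
    "epochs": 15,
    "weight_decay": 1e-4,         # AdamW weight decay
    "lr_backbone": 1e-5,          # learning rate for the ViT backbone
    "lr_other": 1e-4,             # learning rate for all other parameters
}

def lr_for(param_name: str, cfg: dict = TRAIN_CONFIG) -> float:
    """Pick the learning rate for a parameter by name: backbone parameters
    get the smaller rate, all remaining parameters the larger one."""
    if param_name.startswith("backbone."):
        return cfg["lr_backbone"]
    return cfg["lr_other"]
```

In a PyTorch training script this rule would typically feed two optimizer parameter groups (one per learning rate) passed to `torch.optim.AdamW` together with the shared weight decay.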