Cross-modulated Attention Transformer for RGBT Tracking

Authors: Yun Xiao, Jiacong Zhao, Andong Lu, Chenglong Li, Bing Yin, Yin Lin, Cong Liu

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on five public RGBT tracking benchmarks show the outstanding performance of the proposed CAFormer against state-of-the-art methods.
Researcher Affiliation | Collaboration | 1 School of Artificial Intelligence, Anhui University, Hefei, China ... 3 iFLYTEK CO., LTD., Hefei, China
Pseudocode | No | The paper describes the method using mathematical formulations and descriptive text, but does not include any clearly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | Code: https://github.com/opacity-black/CAFormer
Open Datasets | Yes | Experiments on five public RGBT tracking benchmarks... Our experiments are conducted on five public datasets: GTOT (Li et al. 2016), RGBT210 (Li et al. 2017), RGBT234 (Li et al. 2019a), LasHeR (Li et al. 2021), and VTUAV (Pengyu et al. 2022).
Dataset Splits | Yes | We train our model for 10 epochs on the training set of LasHeR (Li et al. 2021)... For GTOT (Li et al. 2016), RGBT210 (Li et al. 2017), and RGBT234 (Li et al. 2019a), we directly evaluate our model without any further fine-tuning. For the VTUAV (Pengyu et al. 2022) dataset, we adopt the VTUAV training set for our training process, and adjust the number of training epochs to 5.
Hardware Specification | Yes | For the training process, CAFormer is trained on 2 NVIDIA 2080 Ti GPUs... Additionally, we complete the speed test on a device with an NVIDIA RTX 3080 Ti GPU.
Software Dependencies | No | The paper mentions the use of 'AdamW (Loshchilov and Hutter 2017)' as the optimization algorithm, but does not specify versions for other key software components like programming languages or libraries.
Experiment Setup | Yes | In our method, the proposed CAFormer block is integrated into the last 3 layers of the backbone, and the CTE strategy is adopted at layers 3, 6, and 9. The search regions are resized to 256×256, while the templates are resized to 128×128. For the training process, CAFormer is trained on 2 NVIDIA 2080 Ti GPUs with a global batch size of 32. We set the learning rates of the backbone network and other parameters to 5e-6 and 5e-5, respectively. The optimization algorithm employed is AdamW (Loshchilov and Hutter 2017) with a weight decay of 1e-4. We train our model for 10 epochs on the training set of LasHeR... For the VTUAV (Pengyu et al. 2022) dataset, we adopt the VTUAV training set for our training process, and adjust the number of training epochs to 5. Following previous work (Hui et al. 2023), all experiments in this paper are loaded with pre-trained weights from the public SOT method (Ye et al. 2022).
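The optimizer portion of the reported setup (AdamW, backbone learning rate 5e-6, 5e-5 for all other parameters, weight decay 1e-4) can be sketched as follows. This is a minimal illustration assuming PyTorch; the module names ("backbone", "head") are placeholders, not the authors' actual code.

```python
import torch
import torch.nn as nn

# Placeholder modules standing in for the ViT backbone and prediction head.
model = nn.ModuleDict({
    "backbone": nn.Linear(8, 8),
    "head": nn.Linear(8, 4),
})

# Two parameter groups, as described in the paper's setup:
# backbone at 5e-6, everything else at 5e-5.
param_groups = [
    {"params": model["backbone"].parameters(), "lr": 5e-6},
    {"params": model["head"].parameters(), "lr": 5e-5},
]

# AdamW with weight decay 1e-4, applied to both groups.
optimizer = torch.optim.AdamW(param_groups, weight_decay=1e-4)
```

Splitting the parameters into groups is the standard way in PyTorch to give a pre-trained backbone a smaller learning rate than newly initialized heads.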