Wave-wise Discriminative Tracking by Phase-Amplitude Separation, Augmentation and Mixture

Authors: Huibin Tan, Mingyu Cao, Kun Hu, Xihuai He, Zhe Wang, Hao Li, Long Lan, Mengzhu Wang

IJCAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on five benchmarks prove the effectiveness of our method.
Researcher Affiliation | Academia | Huibin Tan^1, Mingyu Cao^1, Kun Hu^2, Xihuai He^1, Zhe Wang^3, Hao Li^1, Long Lan^1 and Mengzhu Wang^4. ^1 College of Computer Science and Technology, National University of Defense Technology; ^2 Independent Researcher; ^3 Hong Kong Polytechnic University; ^4 Hebei University of Technology. EMAIL, hu kun @outlook.com, EMAIL, EMAIL, EMAIL, EMAIL, EMAIL
Pseudocode | No | The paper describes the methodology using prose and mathematical equations in sections 3.1, 3.2, 3.3, and 3.4, and provides architectural diagrams in Figure 2 and Figure 3, but does not include any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not contain any explicit statement about releasing the source code or a link to a code repository.
Open Datasets | Yes | The training datasets are COCO [Lin et al., 2014], LaSOT [Fan et al., 2018], GOT-10k [Huang et al., 2018] and TrackingNet [Müller et al., 2018].
Dataset Splits | Yes | GOT-10k consists of 10k sequences for training and 180 videos for testing.
Hardware Specification | Yes | We implement our model in Python using PyTorch and train it with 8 NVIDIA A100 GPUs. Testing is conducted on a single NVIDIA RTX 3070 GPU.
Software Dependencies | No | The paper mentions "We implement our model in Python using PyTorch" but does not provide specific version numbers for these software components.
Experiment Setup | Yes | For WDT-ViT, we set the batch size to 24, the weight decay to 10^-4, and the learning rates to 4×10^-5 for the backbone and 4×10^-4 for the rest of the parameters, respectively. The learning rate decreases by a factor of 10 after 240 epochs. For WDT-HiViT, we set the batch size to 4, the initial learning rate of the backbone network to 2×10^-5, the learning rate of other parameters to 2×10^-4, and the weight decay to 10^-4. The total number of training epochs is 150, and the learning rate decreases by a factor of 10 after 120 epochs.
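The reported hyperparameters can be collected into a small configuration sketch. This is an illustration only, not code from the paper: the dictionary key names and the lr_at_epoch helper are hypothetical, the paper does not state the total epoch count for WDT-ViT, and the actual implementation presumably uses PyTorch optimizers rather than this hand-rolled schedule.

```python
# Hedged sketch of the training settings reported in the paper.
# Key names and the helper function are illustrative, not from the paper.

WDT_VIT = {
    "batch_size": 24,
    "weight_decay": 1e-4,
    "lr_backbone": 4e-5,   # 4 x 10^-5 for the backbone
    "lr_rest": 4e-4,       # 4 x 10^-4 for the remaining parameters
    "decay_epoch": 240,    # lr drops by 10x after this epoch
    "decay_factor": 0.1,
}

WDT_HIVIT = {
    "batch_size": 4,
    "weight_decay": 1e-4,
    "lr_backbone": 2e-5,
    "lr_rest": 2e-4,
    "total_epochs": 150,
    "decay_epoch": 120,
    "decay_factor": 0.1,
}

def lr_at_epoch(base_lr: float, epoch: int, decay_epoch: int,
                factor: float = 0.1) -> float:
    """Single-step decay: scale base_lr by `factor` once decay_epoch is reached."""
    return base_lr * factor if epoch >= decay_epoch else base_lr
```

Under this schedule, the WDT-HiViT backbone would train at 2×10^-5 through epoch 119 and at 2×10^-6 for the remaining 30 epochs.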