VimGeo: Efficient Cross-View Geo-Localization with Vision Mamba Architecture
Authors: Jinglin Huang, Maoqiang Wu, Peichun Li, Wen Wu, Rong Yu
IJCAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on CVUSA, CVACT, and VIGOR datasets validate VimGeo's effectiveness and competitiveness in cross-view geo-localization tasks, achieving the leading results among sequence-modeling-based methods. |
| Researcher Affiliation | Collaboration | (1) School of Automation, Guangdong University of Technology; (2) Gauss Riemann Technologies Co., Ltd., Guangzhou, GD; (3) Frontier Research Center, Peng Cheng Laboratory; (4) School of Electronic Science and Engineering, South China Normal University; (5) Department of Computer and Information Science, University of Macau |
| Pseudocode | No | The paper describes the methodology and model architecture with text and figures (e.g., Figure 2 for model architecture and CGP, Figure 3 for DWBL process), but it does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | The implementation is available at: https://github.com/VimGeoTeam/VimGeo. |
| Open Datasets | Yes | Our method is evaluated on three widely-used cross-view geo-localization datasets: CVUSA [Zhai et al., 2017], CVACT [Liu and Li, 2019], and VIGOR [Zhu et al., 2021]. |
| Dataset Splits | Yes | CVUSA consists of 35,532 training pairs and 8,884 testing pairs, primarily featuring suburban landscapes. CVACT includes 35,532 training pairs, 8,884 validation pairs (CVACT val), and 92,802 testing pairs (CVACT test), focusing on urban regions in Canberra for city-scale geo-localization. |
| Hardware Specification | Yes | The model is trained using 4 NVIDIA A6000 GPUs. |
| Software Dependencies | No | The paper mentions using the AdamW optimizer and refers to the Vision Mamba (Vim) architecture, but does not provide specific version numbers for any software libraries (e.g., Python, PyTorch, TensorFlow, CUDA) used in the implementation. |
| Experiment Setup | Yes | We utilize the AdamW optimizer [Loshchilov, 2017], with the hyperparameter α in the single-direction loss function (Equation 7) set to 10, ensuring optimal convergence. |
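The report quotes α = 10 for the single-direction loss (Equation 7) but does not reproduce the equation itself. In the cross-view geo-localization literature this hyperparameter typically scales a weighted soft-margin triplet loss, log(1 + exp(α·(d_pos − d_neg))); the sketch below assumes that common form, which may differ from the paper's exact Equation 7:

```python
import math

def soft_margin_triplet_loss(d_pos: float, d_neg: float, alpha: float = 10.0) -> float:
    """Weighted soft-margin triplet loss, a common choice in cross-view
    geo-localization (assumed form; the paper's Equation 7 may differ).

    d_pos: distance between a query and its matching aerial embedding.
    d_neg: distance between the query and a non-matching embedding.
    alpha: scaling factor controlling how sharply violations are penalized
           (the paper reports alpha = 10).
    """
    # log1p(exp(x)) is the numerically stable softplus of x.
    return math.log1p(math.exp(alpha * (d_pos - d_neg)))

# A well-separated pair (positive much closer than negative) yields a
# near-zero loss; a barely-separated pair is penalized more heavily.
easy = soft_margin_triplet_loss(0.2, 0.8)   # margin of 0.6
hard = soft_margin_triplet_loss(0.5, 0.6)   # margin of 0.1
```

With α = 10 the loss drops off quickly once the positive distance is clearly smaller than the negative distance, which is consistent with the quoted claim that this setting aids convergence.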