BEV-TSR: Text-Scene Retrieval in BEV Space for Autonomous Driving

Authors: Tao Tang, Dafeng Wei, Zhengyu Jia, Tian Gao, Changwei Cai, Chengkai Hou, Peng Jia, Kun Zhan, Haiyang Sun, Fan JingChen, Yixing Zhao, Xiaodan Liang, Xianpeng Lang, Yang Wang

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results on the multi-level datasets show that BEV-TSR achieves state-of-the-art performance, e.g., 85.78% and 87.66% top-1 accuracy on scene-to-text and text-to-scene retrieval respectively.
Researcher Affiliation | Collaboration | 1 Shenzhen Campus of Sun Yat-sen University; 2 Li Auto Inc.
Pseudocode | No | The paper describes the methodology using text and equations but does not contain any structured pseudocode or algorithm blocks.
Open Source Code | Yes | To address these limitations, we have further constructed the nuScenes-Retrieval dataset based on the nuScenes dataset, and the toolkit codes are attached in the supplement materials and will be public.
Open Datasets | Yes | To this end, we establish a multi-level retrieval dataset, nuScenes-Retrieval, based on the widely adopted nuScenes dataset.
Dataset Splits | No | The paper describes the creation of the nuScenes-Retrieval dataset, but it does not provide specific training, validation, or test split percentages or sample counts for the experiments.
Hardware Specification | No | The implementation details are provided in the supplementary material.
Software Dependencies | No | The implementation details are provided in the supplementary material.
Experiment Setup | No | The implementation details are provided in the supplementary material.