Leveraging Driver Field-of-View for Multimodal Ego-Trajectory Prediction

Authors: M. Eren Akbiyik, Nedko Savov, Danda Pani Paudel, Nikola Popovic, Christian Vater, Otmar Hilliges, Luc Van Gool, Xi Wang

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive evaluations on GEM and DR(eye)VE demonstrate that RouteFormer significantly outperforms state-of-the-art methods, achieving notable improvements in prediction accuracy across diverse conditions. Ablation studies reveal that incorporating driver field-of-view data yields markedly lower average displacement error, especially in challenging scenarios with high PCI scores, underscoring the importance of modeling driver attention.
Researcher Affiliation | Academia | 1 ETH Zürich; 2 INSAIT, Sofia University St. Kliment Ohridski; 3 University of Bern; 4 TU Munich
Pseudocode | No | The paper describes the RouteFormer architecture and its components in detail using text and diagrams (Figure 1, Figure 2) and mathematical formulations for loss functions, but does not include explicit pseudocode or algorithm blocks.
Open Source Code | Yes | All data and code is available at meakbiyik.github.io/routeformer.
Open Datasets | Yes | To tackle data scarcity and enhance diversity, we introduce GEM, a comprehensive dataset of urban driving scenarios enriched with synchronized driver field-of-view and gaze data. All data and code is available at meakbiyik.github.io/routeformer. We also evaluate RouteFormer on the DR(eye)VE dataset (Palazzi et al., 2018).
Dataset Splits | Yes | We choose different drivers for training and evaluation according to the data splits of either dataset (see Appendix D) to better demonstrate the benefit of FOV. Table 13: Breakdown of our GEM dataset. We report duration, weather condition, and PCI for each participant. ... Split: train, val., test
Hardware Specification | No | All experiments are carried out on a machine with an NVIDIA GPU with > 10 GB memory, 64 GB of RAM, and an Intel CPU.
Software Dependencies | Yes | The framework is implemented using PyTorch 2.0.
Experiment Setup | Yes | RouteFormer is trained using the AdamW optimizer with a linear warm-up of 2 epochs and cosine annealing, over a total of 200 epochs. A maximum learning rate of 1×10⁻⁵ and weight decay of 1×10⁻⁴ are used with batch size 16. A full set of hyperparameters can be found in Appendix B.2.
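The reported schedule (2-epoch linear warm-up followed by cosine annealing over 200 epochs, maximum learning rate 1×10⁻⁵) can be sketched as a per-epoch learning-rate function. This is a minimal, dependency-free illustration, not the paper's code: the function name `lr_at_epoch` is ours, and the assumption that the cosine phase decays to zero by the final epoch is one plausible reading, since the paper does not state the annealing floor.

```python
import math

# Hyperparameters as reported (full set in the paper's Appendix B.2).
MAX_LR = 1e-5        # maximum learning rate reached after warm-up
WEIGHT_DECAY = 1e-4  # passed to AdamW; does not affect the schedule itself
WARMUP_EPOCHS = 2
TOTAL_EPOCHS = 200

def lr_at_epoch(epoch: int) -> float:
    """Linear warm-up, then cosine annealing (assumed to decay to 0)."""
    if epoch < WARMUP_EPOCHS:
        # Linear ramp from 0 up to MAX_LR over the warm-up epochs.
        return MAX_LR * (epoch + 1) / WARMUP_EPOCHS
    # Cosine decay over the remaining epochs.
    progress = (epoch - WARMUP_EPOCHS) / (TOTAL_EPOCHS - WARMUP_EPOCHS)
    return MAX_LR * 0.5 * (1.0 + math.cos(math.pi * progress))

if __name__ == "__main__":
    for e in (0, 1, 2, 100, 199):
        print(f"epoch {e:3d}: lr = {lr_at_epoch(e):.3e}")
```

In a PyTorch 2.0 training loop, the same shape is typically obtained by wrapping `torch.optim.AdamW` in a `torch.optim.lr_scheduler.LambdaLR` built from a function like this one.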