Real-Time Motion Prediction via Heterogeneous Polyline Transformer with Relative Pose Encoding

Authors: Zhejun Zhang, Alexander Liniger, Christos Sakaridis, Fisher Yu, Luc Van Gool

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on the Waymo and Argoverse-2 datasets show that HPTR achieves superior performance among end-to-end methods that do not apply expensive post-processing or model ensembling.
Researcher Affiliation | Collaboration | Zhejun Zhang, Alexander Liniger, Christos Sakaridis, and Fisher Yu are with the Computer Vision Lab, ETH Zurich, Switzerland. Luc Van Gool is with CVL, ETH Zurich (CH); PSI, KU Leuven (BE); and INSAIT, Un. Sofia (BU). This work is funded by Toyota Motor Europe via the research project TRACE-Zürich.
Pseudocode | No | Figure 7 in the appendix illustrates the implementation of KNARPE with matrix operations, but it is a diagram, not pseudocode in a structured, step-by-step textual format.
Open Source Code | Yes | The code is available at https://github.com/zhejz/HPTR.
Open Datasets | Yes | We benchmark our method on the two most popular datasets: the Waymo Open Motion Dataset (WOMD) [16] and the Argoverse-2 motion forecasting dataset (AV2) [60].
Dataset Splits | Yes | For WOMD, we randomly sample 25% of all training episodes at each epoch; for AV2 we use 50%.
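The per-epoch subsampling described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name, the toy episode ids, and the seeding are our assumptions; only the fractions (25% for WOMD, 50% for AV2) come from the text.

```python
import random

def epoch_subset(episode_ids, fraction, seed=None):
    """Draw a fresh random subset of training episodes for one epoch.

    Hypothetical helper: the paper only states the fractions
    (25% for WOMD, 50% for AV2), resampled at every epoch.
    """
    rng = random.Random(seed)
    k = int(len(episode_ids) * fraction)
    return rng.sample(episode_ids, k)

# Example: 25% of a toy pool of 100 episode ids (WOMD setting).
subset = epoch_subset(list(range(100)), 0.25, seed=0)
print(len(subset))  # 25
```

Resampling a new subset each epoch (rather than fixing one subset up front) lets training eventually see the full dataset while keeping each epoch cheap.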
Hardware Specification | Yes | We train with a total batch size of 12 episodes on 4 RTX 2080Ti GPUs.
Software Dependencies | No | We use standard Ubuntu, Python, and PyTorch without optimizing for real-time deployment. (No version numbers are provided for these software dependencies.)
Experiment Setup | Yes | We use the AdamW optimizer with an initial learning rate of 1e-4, decayed by 0.5 every 25 epochs. We train with a total batch size of 12 episodes... Our final models are trained for 120 epochs for WOMD and 150 epochs for AV2. The base number of neighbors considered by KNARPE is K = 36. This number is multiplied by γTL = 2, γAG = 4 and γAC = 10 respectively for the enhance-TL, enhance-AG and AC-to-all Transformer. We set NAC = 6 to predict exactly 6 futures as specified by the leaderboard.
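The reported hyperparameters can be collected into a short sketch. This is our reading of the numbers above, assuming a standard step decay for the learning rate; all names and the code structure are ours, only the values are from the paper.

```python
# Hyperparameters reported for HPTR (values from the text; names are ours).
BASE_LR = 1e-4   # AdamW initial learning rate
DECAY = 0.5      # multiplicative decay factor
STEP = 25        # epochs between decays

def lr_at_epoch(epoch: int) -> float:
    """Learning rate under the assumed step schedule (halved every 25 epochs)."""
    return BASE_LR * DECAY ** (epoch // STEP)

# Neighbor counts per Transformer module: base K scaled by its gamma factor.
K_BASE = 36
GAMMA = {"enhance-TL": 2, "enhance-AG": 4, "AC-to-all": 10}
K_PER_MODULE = {name: K_BASE * g for name, g in GAMMA.items()}

print(lr_at_epoch(0))     # 1e-4 at the start of training
print(lr_at_epoch(50))    # decayed twice: 1e-4 * 0.5**2
print(K_PER_MODULE)       # 72, 144, and 360 neighbors respectively
```

In PyTorch this schedule would typically be expressed with `torch.optim.AdamW` plus `torch.optim.lr_scheduler.StepLR(optimizer, step_size=25, gamma=0.5)`.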