DriveTransformer: Unified Transformer for Scalable End-to-End Autonomous Driving
Authors: Xiaosong Jia, Junqi You, Zhiyuan Zhang, Junchi Yan
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | DriveTransformer achieves state-of-the-art performance in both simulated closed-loop benchmark Bench2Drive and real world open-loop benchmark nuScenes with high FPS. We compare DriveTransformer with SOTA E2E-AD methods in Table 1, Table 2, and Table 3. We observe that DriveTransformer persistently outperforms SOTA methods. In ablation studies, all closed-loop experiments are conducted on Dev10, a subset of Bench2Drive 220 routes, and all open-loop results are on Bench2Drive official validation set (50 clips). |
| Researcher Affiliation | Academia | Xiaosong Jia*, Junqi You*, Zhiyuan Zhang*, Junchi Yan. Sch. of Computer Science & Sch. of Artificial Intelligence, Shanghai Jiao Tong University. * Equal Contributions. Correspondence Author. Correspondence author is also affiliated with Shanghai Innovation Institute. |
| Pseudocode | No | The paper describes methods like 'Sensor Cross Attention (SCA)', 'Task Self-Attention (TSA)', and 'Temporal Cross Attention', and a 'Pure Attention Architecture' in sections 3.2 and 3.3. However, these descriptions are in paragraph form or mathematical equations (e.g., Eq 1, 2, 6) rather than structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Correspondence Author https://github.com/Thinklab-SJTU/DriveTransformer/ and We will open source our code and checkpoints. |
| Open Datasets | Yes | DriveTransformer achieves state-of-the-art closed-loop performance in Bench2Drive (Jia et al., 2024) under CARLA simulation and state-of-the-art open-loop planning performance on nuScenes (Caesar et al., 2020b) dataset. We use Bench2Drive (Jia et al., 2024), a closed-loop evaluation protocol under CARLA Leaderboard 2.0 for end-to-end autonomous driving. ... Additionally, we compare our method with other state-of-the-art baselines on nuScenes (Caesar et al., 2020a) open-loop evaluation. |
| Dataset Splits | Yes | We use Bench2Drive (Jia et al., 2024), a closed-loop evaluation protocol under CARLA Leaderboard 2.0 for end-to-end autonomous driving. It provides an official training set, where we use the base set (1000 clips) for fair comparison with all the other baselines. We use the official 220 routes for evaluation. Additionally, we compare our method with other state-of-the-art baselines on nuScenes (Caesar et al., 2020a) open-loop evaluation. In ablation studies, all closed-loop experiments are conducted on Dev10, a subset of Bench2Drive 220 routes, and all open-loop results are on Bench2Drive official validation set (50 clips). Appendix B DEV10 BENCHMARK ... For the 10 high-level types, we select one route for each with diverse weathers and towns. We give the details of Dev10 below: Table 9: Routes of Dev10 protocol. |
| Hardware Specification | Yes | All latency are measured by the averaged inference step-time on CARLA evaluation in A6000. Training batch size is measured by A800 (80G) to fill the GPU memory. All models are trained in Bench2Drive (Jia et al., 2024) base set (1000 clips) for 30 epochs on 8*A800 with a learning rate 1e-4... |
| Software Dependencies | No | We implement the model with Pytorch. ... We use ResNet50 as image backbones... Explanation: The paper mentions "Pytorch" but does not provide a specific version number for it or any other key software libraries or dependencies. |
| Experiment Setup | Yes | All models are trained in Bench2Drive (Jia et al., 2024) base set (1000 clips) for 30 epochs on 8*A800 with a learning rate 1e-4, weight decay 0.05, dropout 0.1, AdamW, and cosine annealing schedule. |
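
The hyperparameters quoted in the Experiment Setup row can be collected into a small configuration sketch. The block below is illustrative only and is not taken from the authors' code: `TrainConfig` and `cosine_annealing_lr` are hypothetical names, and the minimum learning rate of 0, per-epoch stepping, and absence of warmup are assumptions the source does not state. It uses plain Python so that it runs without any deep learning library.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    # Values quoted in the Experiment Setup row.
    epochs: int = 30
    base_lr: float = 1e-4
    weight_decay: float = 0.05
    dropout: float = 0.1
    optimizer: str = "AdamW"
    num_gpus: int = 8
    # Assumption (not stated in the source): the schedule decays to zero.
    min_lr: float = 0.0


def cosine_annealing_lr(cfg: TrainConfig, epoch: int) -> float:
    """Standard cosine annealing, stepped once per epoch (an assumption)."""
    if not 0 <= epoch <= cfg.epochs:
        raise ValueError("epoch must lie in [0, cfg.epochs]")
    # Cosine factor goes from 1 at epoch 0 down to 0 at the final epoch.
    cos_factor = 0.5 * (1.0 + math.cos(math.pi * epoch / cfg.epochs))
    return cfg.min_lr + (cfg.base_lr - cfg.min_lr) * cos_factor


cfg = TrainConfig()
# Learning rate used during each of the 30 training epochs.
schedule = [cosine_annealing_lr(cfg, e) for e in range(cfg.epochs)]
```

Under these assumptions the learning rate starts at the quoted 1e-4, passes through half that value at epoch 15, and decreases monotonically. A real run may differ, for example if the schedule is stepped per iteration or includes warmup.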