reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

DriveTransformer: Unified Transformer for Scalable End-to-End Autonomous Driving

Authors: Xiaosong Jia, Junqi You, Zhiyuan Zhang, Junchi Yan

ICLR 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	DriveTransformer achieves state-of-the-art performance in both simulated closed-loop benchmark Bench2Drive and real world open-loop benchmark nuScenes with high FPS. We compare DriveTransformer with SOTA E2E-AD methods in Table 1, Table 2, and Table 3. We observe that DriveTransformer persistently outperforms SOTA methods. In ablation studies, all closed-loop experiments are conducted on Dev10, a subset of Bench2Drive 220 routes, and all open-loop results are on Bench2Drive official validation set (50 clips).
Researcher Affiliation	Academia	Xiaosong Jia, Junqi You, Zhiyuan Zhang, Junchi Yan Sch. of Computer Science & Sch. of Artificial Intelligence, Shanghai Jiao Tong University Equal Contributions Correspondence Author. Correspondence author is also affiliated with Shanghai Innovation Institute.
Pseudocode	No	The paper describes methods like 'Sensor Cross Attention (SCA)', 'Task Self-Attention (TSA)', and 'Temporal Cross Attention', and a 'Pure Attention Architecture' in sections 3.2 and 3.3. However, these descriptions are in paragraph form or mathematical equations (e.g., Eq 1, 2, 6) rather than structured pseudocode or algorithm blocks.
Open Source Code	Yes	Correspondence Author https://github.com/Thinklab-SJTU/DriveTransformer/ and We will open source our code and checkpoints.
Open Datasets	Yes	DriveTransformer achieves state-of-the-art closed-loop performance in Bench2Drive (Jia et al., 2024) under CARLA simulation and state-of-the-art open-loop planning performance on nuScenes (Caesar et al., 2020b) dataset. We use Bench2Drive (Jia et al., 2024), a closed-loop evaluation protocol under CARLA Leaderboard 2.0 for end-to-end autonomous driving. ... Additionally, we compare our method with other state-of-the-art baselines on nuScenes (Caesar et al., 2020a) open-loop evaluation.
Dataset Splits	Yes	We use Bench2Drive (Jia et al., 2024), a closed-loop evaluation protocol under CARLA Leaderboard 2.0 for end-to-end autonomous driving. It provides an official training set, where we use the base set (1000 clips) for fair comparison with all the other baselines. We use the official 220 routes for evaluation. Additionally, we compare our method with other state-of-the-art baselines on nuScenes (Caesar et al., 2020a) open-loop evaluation. In ablation studies, all closed-loop experiments are conducted on Dev10, a subset of Bench2Drive 220 routes, and all open-loop results are on Bench2Drive official validation set (50 clips). Appendix B DEV10 BENCHMARK ... For the 10 high-level types, we select one route for each with diverse weathers and towns. We give the details of Dev10 below: Table 9: Routes of Dev10 protocol.
Hardware Specification	Yes	All latency are measured by the averaged inference step-time on CARLA evaluation in A6000. Training batch size is measured by A800 (80G) to fill the GPU memory. All models are trained in Bench2Drive (Jia et al., 2024) base set (1000 clips) for 30 epochs on 8*A800 with a learning rate 1e-4...
Software Dependencies	No	We implement the model with Pytorch. ... We use ResNet50 as image backbones... Explanation: The paper mentions "Pytorch" but does not provide a specific version number for it or any other key software libraries or dependencies.
Experiment Setup	Yes	All models are trained in Bench2Drive (Jia et al., 2024) base set (1000 clips) for 30 epochs on 8*A800 with a learning rate 1e-4, weight decay 0.05, dropout 0.1, AdamW, and cosine annealing schedule.
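
The Experiment Setup row gives enough detail to sketch the learning-rate schedule. The snippet below is a minimal illustration, not the authors' training code. It uses the standard cosine annealing formula with the quoted base learning rate (1e-4) and epoch count (30). The per-epoch stepping, the absence of warmup, the minimum learning rate of 0, and the function name `cosine_annealed_lr` are all assumptions, since the quoted text does not specify them.

```python
import math

# Values quoted in the Experiment Setup row.
BASE_LR = 1e-4
EPOCHS = 30
WEIGHT_DECAY = 0.05  # listed for completeness; not used by the schedule
DROPOUT = 0.1        # listed for completeness; not used by the schedule


def cosine_annealed_lr(epoch, base_lr=BASE_LR, total_epochs=EPOCHS, min_lr=0.0):
    """Return the learning rate at `epoch` under cosine annealing.

    Assumptions (not stated in the quoted text): the rate is updated
    once per epoch, there is no warmup, and the floor `min_lr` is 0.
    """
    if not 0 <= epoch <= total_epochs:
        raise ValueError("epoch must lie in [0, total_epochs]")
    # The cosine term falls from 1 at epoch 0 to -1 at the final epoch.
    cos_term = math.cos(math.pi * epoch / total_epochs)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + cos_term)


# Learning rate at every epoch boundary, from 0 through 30 inclusive.
schedule = [cosine_annealed_lr(e) for e in range(EPOCHS + 1)]

if __name__ == "__main__":
    for e in (0, 15, 30):
        print(f"epoch {e:2d}: lr = {schedule[e]:.3e}")
```

Under these assumptions the rate starts at 1e-4, passes through roughly half that value at epoch 15, and decays to the floor at epoch 30. A step-wise (per-iteration) schedule or a nonzero floor would change the intermediate values but not the overall shape.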