S2-Track: A Simple yet Strong Approach for End-to-End 3D Multi-Object Tracking

Authors: Tao Tang, Lijun Zhou, Pengkun Hao, Zihang He, Kalok Ho, Shuo Gu, Zhihui Hao, Haiyang Sun, Kun Zhan, Peng Jia, Xianpeng Lang, Xiaodan Liang

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results on the nuScenes benchmark demonstrate the effectiveness of our S2-Track framework. It achieves state-of-the-art performance with an impressive 66.3% AMOTA on the test split, surpassing the previous best end-to-end solution by a significant margin of 8.9% AMOTA. These results highlight our simple yet non-trivial improvements and showcase the potential of our framework in advancing the field of autonomous driving perception.
Researcher Affiliation | Collaboration | *Equal contribution. Work done during an internship at Li Auto Inc. 1Shenzhen Campus of Sun Yat-sen University, 2Li Auto Inc. Correspondence to: Xiaodan Liang <EMAIL>.
Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks. Methodologies are described in text and supported by architectural diagrams.
Open Source Code | No | "We will include these implementations in the revision."
Open Datasets | Yes | We conduct experiments on the popular nuScenes benchmark (Caesar et al., 2020), which is a large-scale autonomous-driving dataset for 3D detection and tracking, consisting of 700, 150, and 150 scenes for training, validation, and testing, respectively.
Dataset Splits | Yes | We conduct experiments on the popular nuScenes benchmark (Caesar et al., 2020), which is a large-scale autonomous-driving dataset for 3D detection and tracking, consisting of 700, 150, and 150 scenes for training, validation, and testing, respectively.
Hardware Specification | Yes | All experiments are conducted on 8 NVIDIA A100-80GB GPUs.
Software Dependencies | No | The paper mentions the AdamW optimizer but does not specify any software versions for libraries or programming languages (e.g., Python, PyTorch, CUDA versions).
Experiment Setup | Yes | We adopt the AdamW optimizer (Loshchilov & Hutter, 2017) for network training, with the initial learning rate set to 0.01 and the cosine weight decay set to 0.001. By default, the thresholds βlower and βupper are set to 0.3 and 0.7, and the weight coefficients λ are all set to 1.0. We pre-train the image backbone with a single-frame detection task for 12 epochs (small-resolution setting) or 24 epochs (full-resolution setting), and further train the end-to-end tracker on consecutive frames (set to 3 frames) for another 12 epochs (small resolution) or 24 epochs (full resolution).
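The reported training recipe (AdamW, initial learning rate 0.01, weight decay 0.001, cosine annealing over 12 epochs in the small-resolution setting) can be sketched in PyTorch as follows. This is a minimal illustration, not the authors' code: the `torch.nn.Linear` model, the batch shape, and the placement of `scheduler.step()` at epoch granularity are assumptions, since the paper does not release an implementation.

```python
import torch

# Placeholder network standing in for the S2-Track tracker (assumption).
model = torch.nn.Linear(8, 8)

# Optimizer settings as stated in the paper's experiment setup.
optimizer = torch.optim.AdamW(model.parameters(), lr=0.01, weight_decay=1e-3)

# Cosine learning-rate annealing over the 12-epoch small-resolution schedule
# (24 epochs for the full-resolution setting).
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=12)

for epoch in range(12):
    # A real run would iterate over 3-frame clips here; one dummy step suffices
    # to show the optimizer/scheduler wiring.
    loss = model(torch.randn(4, 8)).sum()
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    scheduler.step()  # anneal once per epoch
```

By the end of the schedule the learning rate has annealed from 0.01 toward zero, matching the usual cosine-decay behavior.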