SubjectDrive: Scaling Generative Data in Autonomous Driving via Subject Control
Authors: Binyuan Huang, Yuqing Wen, Yucheng Zhao, Yaosi Hu, Yingfei Liu, Fan Jia, Weixin Mao, Tiancai Wang, Chi Zhang, Chang Wen Chen, Zhenzhong Chen, Xiangyu Zhang
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive evaluations confirm SubjectDrive's efficacy in generating scalable autonomous driving training data, marking a significant step toward revolutionizing data production methods in this field. Extensive experiments on the nuScenes dataset (Caesar et al. 2020) validate the effectiveness of our proposed method. Table 1: Evaluation of data scaling on detection and tracking tasks. Table 5: Ablation studies of different modules in SubjectDrive, with the last row showing the alignment performance on the real validation data. |
| Researcher Affiliation | Collaboration | Binyuan Huang1*, Yuqing Wen2*, Yucheng Zhao3*, Yaosi Hu4*, Yingfei Liu3, Fan Jia3, Weixin Mao3, Tiancai Wang3, Chi Zhang5, Chang Wen Chen4, Zhenzhong Chen1, Xiangyu Zhang3 1Wuhan University 2University of Science and Technology of China 3MEGVII Technology 4The Hong Kong Polytechnic University 5Mach Drive |
| Pseudocode | No | The paper describes the methodology in narrative text and uses diagrams (Figure 3, 4, 5, 6) to illustrate the architecture and components, but does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain any explicit statements about releasing source code, nor does it provide links to a code repository. |
| Open Datasets | Yes | Extensive experiments on the nuScenes dataset (Caesar et al. 2020) validate the effectiveness of our proposed method. We use the nuScenes dataset to train SubjectDrive and assess the visual fidelity and controllability of the generated data. The external subject bank is established by integrating external vehicle datasets from the open-source CompCars (Yang et al. 2015) dataset. |
| Dataset Splits | Yes | We use the nuScenes dataset to train SubjectDrive and assess the visual fidelity and controllability of the generated data. We generated the validation set of nuScenes without applying any pre-processing or post-processing to the selected samples. The internal subject bank is curated by collecting subjects from the training set of the nuScenes dataset. |
| Hardware Specification | Yes | Experiments are conducted on 8 A100 GPUs using the DDIM sampler with 25 steps to produce 256 × 512 resolution video clips spanning 8 frames. |
| Software Dependencies | No | The paper mentions several models and frameworks like Panacea, CLIP, ControlNet, Latent Diffusion Models, and StreamPETR with ResNet50, and the DDIM sampler, but does not provide specific version numbers for any software libraries or dependencies used for implementation. |
| Experiment Setup | Yes | SubjectDrive adopts a two-stage video generation approach: image generation in the first stage (optimized for 56k steps) and video generation in the second (84k steps). Experiments are conducted on 8 A100 GPUs using the DDIM sampler with 25 steps to produce 256 × 512 resolution video clips spanning 8 frames. The evaluation uses StreamPETR with a ResNet50 backbone (He et al. 2016), trained at 256 × 512 resolution. |
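For reproduction purposes, the experiment-setup values quoted in the table can be collected into a single configuration record. The sketch below is our own, hypothetical structure (the class and field names are not from the paper); the numeric values are the ones the paper reports:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubjectDriveSetup:
    """Reported SubjectDrive training/inference settings (values from the paper)."""
    stage1_image_steps: int = 56_000      # first stage: image generation
    stage2_video_steps: int = 84_000      # second stage: video generation
    num_gpus: int = 8                     # A100 GPUs
    sampler: str = "DDIM"
    sampler_steps: int = 25
    resolution: tuple[int, int] = (256, 512)  # (height, width)
    frames_per_clip: int = 8
    detector_backbone: str = "ResNet50"   # StreamPETR evaluation backbone


setup = SubjectDriveSetup()
print(setup.sampler, setup.sampler_steps, setup.resolution)
```

Keeping these values in one frozen record makes it easy to spot which hyperparameters a reproduction attempt would still need to supply (e.g. learning rates and batch sizes, which the paper's quoted setup does not state).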