DiffScene: Diffusion-Based Safety-Critical Scenario Generation for Autonomous Vehicles

Authors: Chejian Xu, Aleksandr Petiushko, Ding Zhao, Bo Li

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experimentation has been conducted to validate the efficacy of our approach. Compared with 6 SOTA baselines, DiffScene generates scenarios that are (1) more safety-critical under different metrics, (2) more realistic under 5 distance functions, and (3) more transferable to different AV algorithms. In addition, we demonstrate that training AV algorithms with scenarios generated by DiffScene leads to significantly higher performance under safety-critical metrics.
Researcher Affiliation | Collaboration | University of Illinois at Urbana-Champaign; Gatik AI; Carnegie Mellon University
Pseudocode | Yes | The detailed process of DiffScene is shown in Algorithm 1 in Section 9.
Open Source Code | No | The paper mentions using the Carla and GUAM simulators but does not provide any statement or link for the source code of the DiffScene methodology itself.
Open Datasets | No | To train the diffusion model µω, we first construct a benign trajectory dataset in Carla by training several RL models from scratch in benign scenarios, collecting a total of 6,995 trajectories.
Dataset Splits | Yes | For training the safety-critical objective model Jϑ, we generate 5,000 trajectories per scenario setting using the trained diffusion model, calculating J(ω) as the ground truth. Each scenario setting uses 4,000 trajectories for training and 1,000 for testing.
Hardware Specification | No | The paper states, "We use Carla (Dosovitskiy et al. 2017; Xu et al. 2022) as our simulator," but does not specify any hardware (GPU or CPU models, memory, etc.) used for running the experiments or training the models.
Software Dependencies | No | The paper mentions using "Carla (Dosovitskiy et al. 2017; Xu et al. 2022) as our simulator" and "RL algorithms: SAC, PPO (Schulman et al. 2017), and TD3 (Fujimoto, Hoof, and Meger 2018)" but does not provide specific version numbers for these software components or libraries.
Experiment Setup | Yes | For the scenarios generated by each generation algorithm, we use 80% of them as the training set. The remaining 20% of scenarios from all algorithms together form a standard test set. We finetune the target SAC model on the different training sets using 3 different random seeds, each for 500 episodes, and report the averaged testing result on the standard test set.
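The evaluation protocol quoted above (per-algorithm 80% training sets, a pooled 20% standard test set, and results averaged over 3 seeds) can be sketched as follows. This is a minimal illustration of the described split logic only; all function names and signatures are hypothetical and not taken from the DiffScene codebase.

```python
import random

def split_scenarios(scenarios_by_algo, train_frac=0.8, seed=0):
    """Hypothetical sketch: 80% of each generator's scenarios become its
    training set; the remaining 20% from all generators are pooled into
    one shared standard test set, as described in the paper's setup."""
    rng = random.Random(seed)
    train_sets, shared_test = {}, []
    for algo, scenarios in scenarios_by_algo.items():
        pool = list(scenarios)
        rng.shuffle(pool)
        cut = int(train_frac * len(pool))
        train_sets[algo] = pool[:cut]   # per-algorithm training set
        shared_test.extend(pool[cut:])  # pooled standard test set
    return train_sets, shared_test

def evaluate_averaged(finetune, evaluate, train_set, test_set, seeds=(0, 1, 2)):
    """Finetune once per random seed (500 episodes each in the paper)
    and average the test-set results across seeds."""
    results = [evaluate(finetune(train_set, seed=s), test_set) for s in seeds]
    return sum(results) / len(results)
```

The pooled test set keeps the comparison fair: every generation algorithm's finetuned model is scored on the same held-out mix of scenarios rather than on its own distribution.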