reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

AdaWM: Adaptive World Model based Planning for Autonomous Driving

Authors: Hang Wang, Xin Ye, Feng Tao, Chenbin Pan, Abhirup Mallik, Burhan Yaman, Liu Ren, Junshan Zhang

ICLR 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Extensive experiments on the challenging CARLA driving tasks demonstrate that AdaWM significantly improves the finetuning process, resulting in more robust and efficient performance in autonomous driving systems.
Researcher Affiliation	Collaboration	Hang Wang (1, 2), Xin Ye (1), Feng Tao (1), Chenbin Pan (1), Abhirup Mallik (1), Burhaneddin Yaman (1), Liu Ren (1), Junshan Zhang (2). (1) Bosch Research North America & Bosch Center for Artificial Intelligence (BCAI); (2) University of California, Davis
Pseudocode	Yes	Algorithm 1 AdaWM: Adaptive World Model based Planning. Require: Pretrained dynamics model WM_ϕ(P) and policy π_ω (parameter Ω). Planning horizon K. Threshold C. Replay buffer B collected from pretraining phase. 1: for finetuning step t = 1, 2, ... do 2: Collect samples W = {(x, a, r)} by following current policy π_t (parameter ω_t) and dynamics model WM_t(P) (parameter ϕ_t). 3: Mismatch Identification: Evaluate the policy mismatch by using samples from B and W to compute the TV distance D_TV(π_t | π_ω) as max_x |π_t(a|x) − π_ω(a|x)|. Evaluate the mismatch of the dynamics model D_TV(P | P̂) by ‖P(x, a) − P̂(x, a)‖_1. 4: if D_TV(P | P̂) > C D_TV(π_t | π_ω) then 5: Update dynamics model B B, ϕt = (B Z) Φ. 6: else 7: Update policy , ωt = ( ) Ω. 8: end if 9: end for
Open Source Code	No	The paper does not provide an explicit statement about releasing their source code, nor does it include a link to a code repository.
Open Datasets	Yes	Experiments Environment. We conduct our experiments in CARLA, an open-source simulator with high-fidelity 3D environment Dosovitskiy et al. (2017). Training Dataset: Bench2Drive. In our experiments, we use the open source Bench2Drive dataset Jia et al. (2024); Li et al. (2024), which is a comprehensive benchmark designed to evaluate end-to-end autonomous driving (E2EAD) systems in a closed-loop manner.
Dataset Splits	Yes	Pretrain-Finetune. In our experiments, we use the tasks from CARLA leaderboard v2 and Bench2Drive Jia et al. (2024) dataset for pretraining. ... Following the pretraining, we evaluate the learning performance in four tasks, respectively: 1) Task ROM03: This task is a ROundabout in Moderate traffic in Town 03... 2) Task RTD12: This task features a Right Turn in Dense traffic in Town 12... 3) Task LTM03: This task involves a Left Turn in Moderate traffic in Town 03... 4) Task LTD03: The most challenging task as it involves a Left Turn in Dense traffic in Town 03.
Hardware Specification	Yes	The pretraining is conducted for 12 hours on a single V100 GPU. After obtaining the pretrained model and policy, we conduct the finetuning phase for one hour on a single V100 GPU.
Software Dependencies	No	The paper mentions using CARLA and DreamerV3 but does not provide specific version numbers for these or for any other software libraries or programming languages used.
Experiment Setup	Yes	Table 6: DreamerV3 hyperparameters, Hafner et al. (2023), listed as name (symbol): value. Replay capacity (FIFO): 10^6; Batch size (B): 16; Batch length (T): 64; Activation: LayerNorm + SiLU. World Model: Number of latents: 32; Classes per latent: 32; Reconstruction loss scale (β_pred): 1.0; Dynamics loss scale (β_dyn): 0.5; Representation loss scale (β_rep): 0.1; Learning rate: 10^-4; Adam epsilon (ϵ): 10^-8; Gradient clipping: 1000. Actor Critic: Imagination horizon (H): 15; Discount horizon (1/(1 − γ)): 333; Return lambda (λ): 0.95; Critic EMA decay: 0.98; Critic EMA regularizer: 1; Return normalization scale (S): Per(R, 95) − Per(R, 5); Return normalization limit (L): 1; Return normalization decay: 0.99; Actor entropy scale (η): 3 × 10^-4; Learning rate: 3 × 10^-5; Adam epsilon (ϵ): 10^-5; Gradient clipping: 100.
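
The mismatch check in Algorithm 1 (Pseudocode row) can be illustrated with a small sketch. This is not the authors' code, which the page reports as unreleased. It assumes discrete distributions stored as lists of probabilities, uses the standard total variation distance (half the L1 distance), averages the dynamics mismatch over samples, and reads the threshold C as a multiplier on the policy mismatch. All function names are hypothetical.

```python
from typing import Sequence

Dist = Sequence[float]  # probabilities over a discrete set, summing to 1


def tv_distance(p: Dist, q: Dist) -> float:
    """Total variation distance between two discrete distributions."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same support size")
    return 0.5 * sum(abs(pi - qi) for pi, qi in zip(p, q))


def policy_mismatch(current: Sequence[Dist], pretrained: Sequence[Dist]) -> float:
    """Largest TV distance over sampled states (one action distribution per state)."""
    return max(tv_distance(p, q) for p, q in zip(current, pretrained))


def dynamics_mismatch(observed: Sequence[Dist], predicted: Sequence[Dist]) -> float:
    """Mean TV distance between observed and predicted next-state distributions.

    Averaging over samples is an assumption made for this sketch.
    """
    pairs = list(zip(observed, predicted))
    return sum(tv_distance(p, q) for p, q in pairs) / len(pairs)


def choose_update(dyn_gap: float, pol_gap: float, threshold_c: float) -> str:
    """Step 4: update the dynamics model if its mismatch dominates, else the policy."""
    return "dynamics" if dyn_gap > threshold_c * pol_gap else "policy"


# Two sampled states, two actions each (made-up numbers for illustration).
current = [[0.7, 0.3], [0.5, 0.5]]
pretrained = [[0.6, 0.4], [0.2, 0.8]]
pol_gap = policy_mismatch(current, pretrained)
dyn_gap = dynamics_mismatch([[1.0, 0.0]], [[0.5, 0.5]])
print(choose_update(dyn_gap, pol_gap, threshold_c=1.0))
```

With these made-up inputs the dynamics mismatch is larger than the policy mismatch, so a threshold of 1.0 selects the dynamics update, while a larger threshold such as 2.0 would select the policy update.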
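
Two of the actor-critic entries in Table 6 (Experiment Setup row) are derived quantities, and a short sketch may make them concrete. The discount horizon of 333 corresponds to 1/(1 − γ), so γ follows directly. The return normalization scale is the gap between the 95th and 5th percentiles of the returns. Clamping that scale from below by the limit L is an assumption based on a common reading of DreamerV3, not something the table states, and the exponential moving average with decay 0.99 is left out for brevity. The class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ActorCriticConfig:
    """Subset of the Table 6 actor-critic values (field names are hypothetical)."""
    imagination_horizon: int = 15
    discount_horizon: float = 333.0  # equals 1 / (1 - gamma)
    return_lambda: float = 0.95
    return_norm_limit: float = 1.0   # L
    return_norm_decay: float = 0.99  # listed for reference, unused in this sketch
    actor_entropy_scale: float = 3e-4

    @property
    def gamma(self) -> float:
        """Discount factor implied by the discount horizon."""
        return 1.0 - 1.0 / self.discount_horizon


def percentile(values: Sequence[float], q: float) -> float:
    """Linearly interpolated q-th percentile, with 0 <= q <= 100."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("need at least one value")
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def return_scale(returns: Sequence[float], limit: float = 1.0) -> float:
    """S = Per(R, 95) - Per(R, 5), clamped from below by the limit (assumed)."""
    spread = percentile(returns, 95) - percentile(returns, 5)
    return max(limit, spread)


cfg = ActorCriticConfig()
print(round(cfg.gamma, 4))
print(return_scale([float(r) for r in range(101)], cfg.return_norm_limit))
```

The clamp keeps small return spreads from being amplified: under this assumption, any batch whose 5th-to-95th percentile gap is below 1 is divided by 1 rather than by the smaller spread.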