Semi-Supervised Vision-Centric 3D Occupancy World Model for Autonomous Driving
Authors: Xiang Li, Pengfei Li, Yupeng Zheng, Wei Sun, Yan Wang, Yilun Chen
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on the nuScenes dataset validate the effectiveness and scalability of our method, and demonstrate that PreWorld achieves competitive performance across 3D occupancy prediction, 4D occupancy forecasting and motion planning tasks. |
| Researcher Affiliation | Academia | Xiang Li, Pengfei Li, Yupeng Zheng, Wei Sun, Yan Wang, Yilun Chen; Institute for AI Industry Research (AIR), Tsinghua University; EMAIL; EMAIL |
| Pseudocode | No | The paper describes methods and architectures but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks, nor structured code-like procedures. |
| Open Source Code | Yes | Codes and models can be accessed at https://github.com/getterupper/PreWorld. |
| Open Datasets | Yes | Our experiments are conducted on the Occ3D-nuScenes benchmark (Tian et al., 2024), which provides dense semantic occupancy annotations for the widely used nuScenes dataset (Caesar et al., 2020). |
| Dataset Splits | Yes | The official split for training and validation sets is employed. |
| Hardware Specification | Yes | All experiments are conducted on 8 NVIDIA A100 GPUs. |
| Software Dependencies | No | The paper mentions using Adam as the optimizer and references specific models like BEVStereo and FB-OCC, but it does not specify software dependencies with version numbers (e.g., Python version, PyTorch version, etc.). |
| Experiment Setup | Yes | For training, we set the batch size to 16, use Adam as the optimizer, and train with a learning rate of 1 × 10⁻⁴. All the hyperparameters λ in the loss functions have been set to 1.0. For 3D occupancy prediction task, PreWorld undergoes 6 epochs in self-supervised pre-training stage and 12 epochs in fully-supervised fine-tuning stage. For 4D occupancy forecasting and motion planning task, PreWorld undergoes 8 epochs in self-supervised pre-training stage and 18 epochs in fully-supervised fine-tuning stage. |
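
The reported experiment setup can be collected into a small configuration sketch, which may help when trying to reproduce the two training schedules. The class name `TrainConfig`, its field names, and the dictionary keys below are hypothetical and are not taken from the PreWorld repository; only the numeric values (batch size, optimizer, learning rate, loss weight, epoch counts) come from the table above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hypothetical container for the hyperparameters reported in the paper."""

    pretrain_epochs: int          # self-supervised pre-training stage
    finetune_epochs: int          # fully-supervised fine-tuning stage
    batch_size: int = 16          # reported batch size
    optimizer: str = "Adam"       # reported optimizer
    learning_rate: float = 1e-4   # reported learning rate
    loss_weight: float = 1.0      # every lambda in the loss functions is 1.0

    @property
    def total_epochs(self) -> int:
        # Sum of both stages; the stages themselves run one after the other.
        return self.pretrain_epochs + self.finetune_epochs


# Keys are illustrative labels for the two reported schedules.
CONFIGS = {
    "occupancy_prediction_3d": TrainConfig(pretrain_epochs=6, finetune_epochs=12),
    "forecasting_4d_and_planning": TrainConfig(pretrain_epochs=8, finetune_epochs=18),
}

if __name__ == "__main__":
    for task, cfg in CONFIGS.items():
        print(f"{task}: {cfg.pretrain_epochs} + {cfg.finetune_epochs} "
              f"= {cfg.total_epochs} epochs, lr={cfg.learning_rate}")
```

Software versions (Python, PyTorch) are not specified in the paper, so they are intentionally left out of this sketch.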