Achieving Speed-Accuracy Balance in Vision-based 3D Occupancy Prediction via Geometric-Semantic Disentanglement

Authors: Yulin He, Wei Chen, Siqi Wang, Tianci Xun, Yusong Tan

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our method achieves 39.4% mIoU at 20 FPS on Occ3D-nuScenes, showcasing a state-of-the-art balance between accuracy and efficiency. We evaluate our model using the Occ3D-nuScenes (Tian et al. 2023) benchmark, which is based on the nuScenes (Caesar et al. 2020) dataset and constructed for the CVPR 2023 3D occupancy prediction challenge.
Researcher Affiliation | Academia | School of Computer, National University of Defense Technology, Changsha, China
Pseudocode | No | The paper describes the methods in prose and through architectural diagrams (Figure 3 and Figure 4) but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | Code: https://github.com/harrylin-hyl/GSD-OCC
Open Datasets | Yes | We evaluate our model using the Occ3D-nuScenes (Tian et al. 2023) benchmark, which is based on the nuScenes (Caesar et al. 2020) dataset and constructed for the CVPR 2023 3D occupancy prediction challenge.
Dataset Splits | Yes | The dataset consists of 1000 videos, split into 700 for training, 150 for validation, and 150 for testing.
Hardware Specification | Yes | During training, we use a batch size of 32 on 8 Nvidia A100 GPUs. ... During inference, we use a batch size of 1 on a single Nvidia A100 GPU. The FPS of all methods are evaluated on an Nvidia A100 GPU, except for FastOcc, which is reported using an Nvidia V100 GPU in its paper.
Software Dependencies | No | The paper mentions using ResNet-50 as the image backbone, the AdamW optimizer, and the mmdetection3d codebase, but does not provide specific version numbers for these software components or other key libraries.
Experiment Setup | Yes | We maintain a memory queue of length 15 to store historical features. For RLK-3DConv, we set the size of the convolution kernel to [11, 11, 1]. The steepness parameter r is set to 5 in geometric-semantic disentangled learning. During training, we use a batch size of 32 on 8 Nvidia A100 GPUs. Unless otherwise specified, all models are trained for 24 epochs using the AdamW optimizer (Loshchilov, Hutter et al. 2017) with a learning rate of 1e-4 and a weight decay of 0.05.
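The reported hyperparameters can be gathered into a single training configuration. Below is a minimal sketch in the dict-based config style used by mmdetection3d-family codebases; every key name here is an assumption for illustration, not the schema of the actual GSD-OCC repository:

```python
# Hypothetical training config collecting the hyperparameters reported in
# the paper; key names are illustrative, not the real GSD-OCC config schema.
train_cfg = dict(
    memory_queue_length=15,         # queue of historical features
    rlk_3dconv_kernel=[11, 11, 1],  # RLK-3DConv kernel size
    steepness_r=5,                  # r in geometric-semantic disentangled learning
    batch_size=32,                  # total batch, spread over 8 Nvidia A100 GPUs
    num_gpus=8,
    epochs=24,
    optimizer=dict(
        type="AdamW",
        lr=1e-4,
        weight_decay=0.05,
    ),
)

# Per-GPU batch size implied by the reported totals.
per_gpu_batch = train_cfg["batch_size"] // train_cfg["num_gpus"]
print(per_gpu_batch)  # → 4
```

A config like this makes the split between architectural constants (queue length, kernel size, r) and optimization settings (epochs, optimizer) explicit, which is how such codebases typically organize reproducible runs.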