VLScene: Vision-Language Guidance Distillation for Camera-Based 3D Semantic Scene Completion

Authors: Meng Wang, Huilong Pi, Ruihui Li, Yunchuan Qin, Zhuo Tang, Kenli Li

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results demonstrate that VLScene achieves rank-1st performance on challenging benchmarks SemanticKITTI and SSCBench-KITTI-360, yielding remarkably mIoU scores of 17.52 and 19.10, respectively. To evaluate the performance of VLScene, we conduct thorough experiments using the large outdoor datasets SemanticKITTI (Behley et al. 2019) and SSCBench-KITTI-360 (Li et al. 2023b).
Researcher Affiliation | Academia | College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
Pseudocode | No | The paper describes the methodology using text and mathematical equations, but it does not contain any clearly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | Code: https://github.com/willemeng/VLScene
Open Datasets | Yes | To evaluate the performance of VLScene, we conduct thorough experiments using the large outdoor datasets SemanticKITTI (Behley et al. 2019) and SSCBench-KITTI-360 (Liao, Xie, and Geiger 2022; Li et al. 2023b).
Dataset Splits | Yes | Quantitative Results. Table 1 presents a comparison of our VLScene with other state-of-the-art camera-based SSC methods on the SemanticKITTI hidden test set. As shown in Table 2, VLScene also exhibits a significant advantage in semantic and geometric analysis over current camera-based approaches on the rich data samples SSCBench-KITTI-360 benchmark. Furthermore, Table 3 shows that we provide different ranges of results on the SemanticKITTI validation set.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, processor types, memory amounts, or detailed computer specifications) used for running its experiments.
Software Dependencies | No | The paper does not provide specific ancillary software details, such as library or solver names with version numbers, needed to replicate the experiment.
Experiment Setup | No | The paper describes the training loss function and balancing coefficients ($\mathcal{L} = \lambda_{ssc}\mathcal{L}_{ssc} + \lambda_{kd}\mathcal{L}_{kd}$), but does not provide specific values for hyperparameters such as learning rate, batch size, number of epochs, or optimizer settings. It states 'where several λ are balancing coefficients' without giving their concrete values.
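To make the Experiment Setup entry concrete, the weighted objective it quotes (the SSC loss plus the distillation loss, each scaled by a balancing coefficient) can be sketched as below. This is a minimal illustration only, not the authors' implementation: the function name `combined_loss` and the default coefficient values of 1.0 are assumptions, since the paper does not report the λ values.

```python
def combined_loss(loss_ssc: float, loss_kd: float,
                  lambda_ssc: float = 1.0, lambda_kd: float = 1.0) -> float:
    """Weighted sum L = lambda_ssc * L_ssc + lambda_kd * L_kd.

    The default coefficients are placeholders (assumption); the paper
    does not state the values it used.
    """
    return lambda_ssc * loss_ssc + lambda_kd * loss_kd


# Hypothetical per-batch loss values, chosen only to show the arithmetic.
total = combined_loss(loss_ssc=2.0, loss_kd=4.0, lambda_ssc=1.0, lambda_kd=0.5)
print(total)  # 1.0 * 2.0 + 0.5 * 4.0 = 4.0
```

Because the coefficients are unreported, anyone reproducing the training would likely need to read them from the released code or tune them.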