Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

3D StreetUnveiler with Semantic-aware 2DGS - a simple baseline

Authors: Jingwei Xu, Yikai Wang, Yiqun Zhao, Yanwei Fu, Shenghua Gao

ICLR 2025 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments conducted on the street scene dataset successfully reconstructed a 3D representation of the empty street. The mesh representation of the empty street can be extracted for further applications. Our experiments were conducted on a single NVIDIA A40 GPU with peak memory usage of 16GB. The quantitative comparison results are shown in Tab. 1, and the qualitative comparison of 3D inpainting methods are shown in Fig. 5. Ablation of different inpainting methods as pseudo labels. We compare the reconstruction results with pseudo labels from different inpainting methods. From Fig. 6, we can observe that time reversal will maintain the consistency between View 1 and View 2.
Researcher Affiliation | Collaboration | Jingwei Xu (1), Yikai Wang (2), Yiqun Zhao (3,6), Yanwei Fu (5), Shenghua Gao (3,4); 1 ShanghaiTech University, 2 Nanyang Technological University, 3 The University of Hong Kong, 4 HKU Shanghai Intelligent Computing Research Center, 5 Fudan University, 6 Transcengram
Pseudocode | No | The paper describes the method and its steps in prose, but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | No | The paper does not contain an explicit statement about releasing source code or a link to a code repository.
Open Datasets | Yes | Dataset. For the evaluation of our approach from the reconstruction aspect and the object removal aspect, we adopt real-world street scenes from Waymo Open Perception Dataset Sun et al. (2020) and Pandaset Xiao et al. (2021). We downscale the resolution to 484×320. The Pandaset collects data from 6 camera perspectives, encompassing 360 degrees in FOV. We downscale the resolution to 480×270. We select front-view video sequences as the same experimental setup in Yan et al. (2024); Chen et al. (2023b); Zhou et al. (2024), using 24 scenes from Waymo and 9 scenes from Pandaset for our experiments.
Dataset Splits | No | We select front-view video sequences as the same experimental setup in Yan et al. (2024); Chen et al. (2023b); Zhou et al. (2024), using 24 scenes from Waymo and 9 scenes from Pandaset for our experiments. This mentions the number of scenes used, but not explicit training/test/validation splits (e.g., percentages or frame counts) for the datasets.
Hardware Specification | Yes | Our experiments were conducted on a single NVIDIA A40 GPU with peak memory usage of 16GB.
Software Dependencies | No | The paper mentions several software tools and models, such as SegFormer, LeftRefill, SDXL, ProPainter, and Open3D, but does not provide specific version numbers for these software dependencies. For example, it mentions "SegFormer Xie et al. (2021)" but not "SegFormer vX.Y".
Experiment Setup | Yes | We empirically set λd = 100, λn = 0.05, λds = 100, λs = 0.1, and λα = 0.001. The threshold is set as 0.99 in our implementation. We prune the Gaussians with opacity lower than a threshold ϵ to further eliminate the noisy semantics in the 3D world, with ϵ set as 0.3 in our experiments.
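As a reproduction aid, the hyperparameters quoted in the Experiment Setup row can be collected into a single configuration. This is a minimal sketch: the names `HPARAMS` and `prune_by_opacity` are illustrative (the paper does not release code), and only the numeric values are taken from the paper.

```python
# Hypothetical configuration sketch of the reported hyperparameters.
# Only the values come from the paper; all names are illustrative.
HPARAMS = {
    "lambda_d": 100,        # λd, as reported
    "lambda_n": 0.05,       # λn, as reported
    "lambda_ds": 100,       # λds, as reported
    "lambda_s": 0.1,        # λs, as reported
    "lambda_alpha": 0.001,  # λα, as reported
    "threshold": 0.99,      # "The threshold is set as 0.99"
    "opacity_eps": 0.3,     # ϵ for pruning noisy Gaussians
}

def prune_by_opacity(opacities, eps=HPARAMS["opacity_eps"]):
    """Drop Gaussians whose opacity falls below the pruning threshold ϵ."""
    return [o for o in opacities if o >= eps]
```

For example, `prune_by_opacity([0.1, 0.3, 0.9])` keeps only the Gaussians at or above ϵ = 0.3, mirroring the pruning step described in the quoted setup.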