SOVGaussian: Sparse-View 3D Gaussian Splatting for Open-Vocabulary Scene Understanding

Authors: Peng Ling, Tiao Tan, Jiaqi Lin, Wenming Yang

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our comprehensive experiments demonstrate that SOVGaussian is able to reconstruct a superior scene representation from few-shot images, outperforming existing state-of-the-art methods and achieving significantly better performance on novel view language querying and synthesis. Experimental results demonstrate that our method outperforms existing state-of-the-art methods, achieving up to a 56.9% improvement in mIoU compared to LangSplat on the 3DOVS dataset and up to a 36% improvement on the DTU dataset. Ablation Study: Here, we conduct ablations on the 3DOVS dataset to evaluate the performance increment contributed by each component, including open-vocabulary querying accuracy and synthesis quality from novel views.
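The headline metric above is mIoU (mean intersection-over-union). As a point of reference only — this is a generic sketch of the standard metric, not the paper's evaluation code — mIoU over a predicted and ground-truth segmentation can be computed as:

```python
import numpy as np

def miou(pred, gt, num_classes):
    """Mean IoU over classes that appear in prediction or ground truth.

    pred, gt: integer label maps of identical shape.
    Classes absent from both maps are skipped so they do not
    artificially inflate or deflate the mean.
    """
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, gt == c).sum()
        union = np.logical_or(pred == c, gt == c).sum()
        if union > 0:
            ious.append(inter / union)
    return float(np.mean(ious))
```

A "56.9% improvement" in such a ratio metric can mean either an absolute or a relative gain; the report does not disambiguate.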
Researcher Affiliation | Academia | Shenzhen International Graduate School, Tsinghua University; EMAIL, EMAIL
Pseudocode | No | The paper describes the methodology using textual explanations and mathematical equations (e.g., Equations 1-16), but it does not include any clearly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | Repository: https://github.com/Brucess/SOVGaussian
Open Datasets | Yes | We evaluate our method on the 3DOVS (Liu et al. 2023) and DTU datasets (Aanæs et al. 2016).
Dataset Splits | Yes | Different from their vanilla pipelines that use all views (i.e., 35 for 3DOVS and 49 for DTU) for training, we use only 3 views and evaluate generalization on novel views. To ensure fair comparison, all methods are trained following the same sparse-view protocol as ours, using the same 3 input views, camera poses, and test views. View selection follows uniform sampling for 3DOVS and the protocol in (Li et al. 2024) for DTU.
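The paper does not spell out what "uniform sampling" means for view selection; one plausible reading — a sketch, not the authors' code — is evenly spaced indices over the available views:

```python
import numpy as np

def uniform_view_ids(total_views, n_train=3):
    """Hypothetical uniform view selection: pick n_train indices
    evenly spaced over [0, total_views - 1].

    For 3DOVS (35 views) this yields [0, 17, 34]; the remaining
    views would serve as held-out novel views for evaluation.
    """
    return np.linspace(0, total_views - 1, n_train).round().astype(int).tolist()
```

Anyone reproducing the split should check the released repository for the exact indices, since a different convention (e.g., excluding endpoint views) changes the training set.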
Hardware Specification | Yes | We train for 20,000 iterations on the 3DOVS dataset and 6,000 iterations on the DTU dataset using a single RTX 3090, requiring approximately 1 hour and 25 minutes, respectively, using around 4GB of memory.
Software Dependencies | No | "Our approach is based on 3DGS (Kerbl et al. 2023) and implemented by PyTorch." While PyTorch is mentioned, no specific version number is provided for it or any other software dependency.
Experiment Setup | Yes | We train for 20,000 iterations on the 3DOVS dataset and 6,000 iterations on the DTU dataset... We empirically set γ to 20, τ = 5% for LOP, and λ = 0.1 for the loss function. We set the interval for LOP to 1000 iterations... We further control hyperparameters such as learning rate and density increment percentage to enhance the baselines' performance.
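The reported hyperparameters can be collected into one place for a reproduction attempt. The dictionary below is a hypothetical config — key names are illustrative and do not come from the official repository; only the values are quoted from the paper:

```python
# Hypothetical reproduction config for SOVGaussian.
# Values are taken from the paper's experiment setup;
# the key names themselves are assumptions, not the repo's.
SOVGAUSSIAN_CONFIG = {
    "iterations": {"3dovs": 20_000, "dtu": 6_000},
    "gamma": 20,           # γ, set empirically
    "lop_tau": 0.05,       # τ = 5% threshold for LOP
    "loss_lambda": 0.1,    # λ weighting in the loss function
    "lop_interval": 1_000, # iterations between LOP steps
}
```

Note that the learning rate and density increment percentage are stated to be tuned per baseline but their values are not quoted here, so they are deliberately omitted.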