Tackling View-Dependent Semantics in 3D Language Gaussian Splatting

Authors: Jiazhong Cen, Xudong Zhou, Jiemin Fang, Changsong Wen, Lingxi Xie, Xiaopeng Zhang, Wei Shen, Qi Tian

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments demonstrate that LaGa effectively captures key information from view-dependent semantics, enabling a more comprehensive understanding of 3D scenes. Notably, under the same settings, LaGa achieves a significant improvement of +18.7% mIoU over the previous SOTA on the LERF-OVS dataset. Our code is available at: https://github.com/SJTU-DeepVisionLab/LaGa.
Researcher Affiliation | Collaboration | 1MoE Key Lab of Artificial Intelligence, AI Institute, Shanghai Jiao Tong University; 2Huawei Technologies Co., Ltd. Correspondence to: Wei Shen <EMAIL>, Jiemin Fang <EMAIL>.
Pseudocode | Yes | B.4. Detailed Algorithm of the Cross-View Descriptor Extraction: the pseudocode of the cross-view descriptor extraction is shown in Algorithm 1.
Open Source Code | Yes | Our code is available at: https://github.com/SJTU-DeepVisionLab/LaGa.
Open Datasets | Yes | We evaluate LaGa on LERF-OVS (Kerr et al., 2023; Qin et al., 2024), 3D-OVS (Liu et al., 2023a), and ScanNet (Dai et al., 2017). LERF-OVS consists of complex 360° indoor scenes, while 3D-OVS features forward-facing scenes with long-tailed categories. Both datasets provide 2D annotations.
Dataset Splits | No | The paper mentions training on a 'training set I' and evaluates on datasets such as LERF-OVS, 3D-OVS, and ScanNet. However, it does not explicitly provide training/test/validation split percentages, sample counts, or references to predefined splits for these datasets.
Hardware Specification | Yes | All experiments are conducted on a single NVIDIA RTX 3090 GPU.
Software Dependencies | No | The paper mentions using the "ViT-H model of SAM and the OpenCLIP ViT-B/16 model of CLIP" but does not specify software dependencies with version numbers (e.g., Python 3.x, PyTorch 1.x).
Experiment Setup | Yes | For each scene, the 3D-GS model is trained for 30,000 iterations, followed by 30,000 iterations of training the Gaussian affinity features. For ScanNet, we apply a KNN-based local feature smoothing operation following SAGA (Cen et al., 2025a) while training the affinity features. During inference, in addition to the relevance score, we find that applying an auxiliary cosine similarity threshold (0.23) helps remove unwanted regions. For all remaining objects in the scene, relevance scores are first min-max normalized. A 3D bilateral filtering step is then applied to the resulting 3D relevance map to suppress noise. Gaussians with relevance scores above 0.6 are classified as foreground.
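The inference-time post-processing described in the row above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name `select_foreground` and the assumption that per-Gaussian relevance scores and auxiliary CLIP cosine similarities arrive as NumPy arrays are hypothetical, and the 3D bilateral filtering step is omitted because it requires the Gaussians' spatial neighborhood structure.

```python
import numpy as np

def select_foreground(relevance, cos_sim, cos_thresh=0.23, fg_thresh=0.6):
    """Sketch of the described post-processing (illustrative, not the paper's API).

    relevance: (N,) per-Gaussian relevance scores for a text query.
    cos_sim:   (N,) auxiliary cosine similarities for the same query.
    Returns a boolean (N,) foreground mask.
    """
    # Auxiliary cosine-similarity threshold (0.23) removes unwanted regions.
    keep = cos_sim > cos_thresh
    scores = np.where(keep, relevance, 0.0)

    # Min-max normalize the surviving relevance scores to [0, 1].
    lo, hi = scores.min(), scores.max()
    scores = (scores - lo) / (hi - lo + 1e-8)

    # (The paper additionally applies 3D bilateral filtering to the relevance
    # map here to suppress noise; omitted in this sketch.)

    # Gaussians above the relevance threshold (0.6) are classified foreground.
    return scores > fg_thresh
```

For example, a Gaussian with high relevance but a cosine similarity below 0.23 is zeroed out before normalization, so it cannot pass the 0.6 foreground threshold.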