FastLGS: Speeding Up Language Embedded Gaussians with Feature Grid Mapping
Authors: Yuzhou Ji, He Zhu, Junshu Tang, Wuyi Liu, Zhizhong Zhang, Xin Tan, Yuan Xie
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we first show the speed and quality of open-vocabulary object retrieval in comparison with other state-of-the-art methods through quantitative experiments, then we provide qualitative results of downstream tasks including 3D segmentation and object deletion. Ablation studies are conducted to demonstrate the rationality of feature grid based design. Datasets: For quantitative experiments, we train and evaluate the models on datasets including SPIn-NeRF (Mirzaei et al. 2023), LERF (Kerr et al. 2023) and 3D-OVS (Liu et al. 2023). |
| Researcher Affiliation | Academia | Yuzhou Ji1*, He Zhu1*, Junshu Tang2, Wuyi Liu1, Zhizhong Zhang1,3, Xin Tan1, Yuan Xie1. 1: East China Normal University, Shanghai, China; 2: Shanghai Jiao Tong University, Shanghai, China; 3: Shanghai Key Laboratory of Computer Software Evaluating and Testing. EMAIL |
| Pseudocode | Yes | Algorithm 1: Cross View Grid Mapping. Input: image sequence {I_t \| t = 0, 1, ..., T}, segmentation masks {M_{i,j} \| i = 0, 1, ..., T; j = 0, 1, ..., m_i}, CLIP embeddings {L_{i,j} \| i = 0, 1, ..., T; j = 0, 1, ..., m_i} and color distributions {C_{i,j} \| i = 0, 1, ..., T; j = 0, 1, ..., m_i}, index Idx_{i,j} of mask M_{i,j}'s low-dimensional feature, thresholds τ and θ. Parameters: index of image i, index j of image I_i's mask, index of the most corresponding mask k, images for matching I′, masks for matching M′. Output: low-dimensional feature indices Idx of masks M and the number K of objects in a scene. 1: K, Idx_0 = initialise(I_0, M_0, L_0) 2: I′ = {I_0}; M′ = {M_{0,i} \| i = 0, 1, ..., m_0} 3: Let i = 1 4: while i ≤ T do 5: corrInfo = correspondKp(I_i, I′, M′, ) 6: while j ≤ m_i do 7: k, M′_k, numKp = selMostCorrMask(M_{i,j}, corrInfo) 8: if numKp ≥ τ then 9: Idx_{i,j} = low-dim feature index of M′_k 10: else 11: sims_{i,j} = computeSims(M_{i,j}, M′, L, C) 12: k, M′_k, sim = selMostSimMask(M_{i,j}, M′, sims_{i,j}) 13: if sim ≥ θ then 14: Idx_{i,j} = low-dim feature index of M′_k 15: else 16: K = K + 1; Idx_{i,j} = K 17: end if 18: end if 19: I′ = union(I′, {I_i}) 20: M′ = union(M′, {M_{i,j} \| j = 0, 1, ..., m_i}) 21: j = j + 1 22: end while 23: i = i + 1 24: end while |
| Open Source Code | Yes | Project page: https://george-attano.github.io/FastLGS |
| Open Datasets | Yes | For quantitative experiments, we train and evaluate the models on datasets including SPIn-NeRF (Mirzaei et al. 2023), LERF (Kerr et al. 2023) and 3D-OVS (Liu et al. 2023). |
| Dataset Splits | No | The paper mentions using the SPIn-NeRF, LERF, and 3D-OVS datasets for training and evaluation. However, it does not explicitly describe how these datasets were split (e.g., percentages for training, validation, and testing, or references to predefined splits used for reproduction). It states 'We evaluate the IoU and pixel-wise accuracy of masks with provided ground truth (1008 × 567)', but this refers to ground truth data, not the experimental splits. |
| Hardware Specification | Yes | All results are reported running on a single TITAN RTX GPU. |
| Software Dependencies | No | The paper mentions using specific models such as the "OpenCLIP ViT-B/16 model" and the "SAM ViT-H model" but does not provide version numbers for any software libraries, programming languages, or development environments used (e.g., Python, PyTorch, CUDA versions). |
| Experiment Setup | Yes | We train the features and scenes in 3DGS for 30,000 iterations. Activation threshold τ_ac, correspondence threshold τ and θ are set to 5.0, 4 and 0.95. Weight α is set to 0.3. f is normalized to (0, 255)^3. |
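
The control flow of Algorithm 1 in the Pseudocode row can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function name `cross_view_grid_mapping`, the helper `_best`, and the callables `count_keypoints` and `similarity` are hypothetical stand-ins for the paper's keypoint-correspondence and CLIP/color-similarity steps. The sketch also assumes that each mask of the first image starts its own object, that both thresholds are compared with `>=`, and that the masks of an image join the matching set only after the whole image has been processed. The default thresholds (τ = 4, θ = 0.95) are taken from the Experiment Setup row.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple

MaskKey = Tuple[int, int]  # (image index i, mask index j)


def _best(seen: List[MaskKey], score: Callable[[MaskKey], float]
          ) -> Tuple[Optional[MaskKey], float]:
    """Return the highest-scoring seen mask, or (None, -inf) if none exist."""
    best_key, best_score = None, float("-inf")
    for m in seen:
        s = score(m)
        if s > best_score:
            best_key, best_score = m, s
    return best_key, best_score


def cross_view_grid_mapping(
    masks_per_image: Sequence[int],
    count_keypoints: Callable[[MaskKey, MaskKey], int],   # hypothetical helper
    similarity: Callable[[MaskKey, MaskKey], float],      # hypothetical helper
    tau: int = 4,
    theta: float = 0.95,
) -> Tuple[Dict[MaskKey, int], int]:
    """Assign one object index to every mask across views (sketch only)."""
    idx: Dict[MaskKey, int] = {}
    seen: List[MaskKey] = []
    # Assumed initialisation: each mask of image 0 is a distinct object.
    for j in range(masks_per_image[0]):
        idx[(0, j)] = j + 1
        seen.append((0, j))
    num_objects = masks_per_image[0]

    for i in range(1, len(masks_per_image)):
        current: List[MaskKey] = []
        for j in range(masks_per_image[i]):
            key = (i, j)
            # First try keypoint correspondence against earlier masks.
            match, n_kp = _best(seen, lambda m: count_keypoints(key, m))
            if match is not None and n_kp >= tau:
                idx[key] = idx[match]
            else:
                # Fall back to feature similarity (CLIP and color in the paper).
                match, sim = _best(seen, lambda m: similarity(key, m))
                if match is not None and sim >= theta:
                    idx[key] = idx[match]
                else:
                    # No match: register a new object.
                    num_objects += 1
                    idx[key] = num_objects
            current.append(key)
        seen.extend(current)  # simplification: extend after the whole image
    return idx, num_objects


# Toy data: two images with 2 and 3 masks; the scores are made up.
kp = {((1, 0), (0, 0)): 10}
sim = {((1, 1), (0, 1)): 0.97, ((1, 2), (0, 0)): 0.2}
idx, num_objects = cross_view_grid_mapping(
    [2, 3],
    lambda a, b: kp.get((a, b), 0),
    lambda a, b: sim.get((a, b), 0.0),
)
print(idx, num_objects)
```

In the toy run, mask (1, 0) is matched by keypoints, mask (1, 1) by similarity, and mask (1, 2) becomes a new object. Passing the two scoring steps in as callables keeps the sketch independent of any particular keypoint matcher or embedding model.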