Locality-aware Gaussian Compression for Fast and High-quality Rendering
Authors: Seungjoo Shin, Jaesik Park, Sunghyun Cho
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results demonstrate that our approach outperforms the rendering quality of existing compact Gaussian representations for representative real-world 3D datasets while achieving 54.6× to 96.6× compressed storage size and 2.1× to 2.4× rendering speed compared to 3DGS. Our approach also demonstrates an average 2.4× higher rendering speed than the state-of-the-art compression method with comparable compression performance. |
| Researcher Affiliation | Academia | Seungjoo Shin¹, Jaesik Park², Sunghyun Cho¹ (¹POSTECH, ²Seoul National University) |
| Pseudocode | Yes | A.4 ALGORITHMS We provide the locality-aware Gaussian representation learning step of the LocoGS pipeline. Algorithm 1: Locality-aware Gaussian representation learning. Algorithm 2: Acquisition of Gaussians during optimization. Algorithm 3: Masking Gaussians during optimization. Algorithm 4: Pruning Gaussians during optimization. |
| Open Source Code | No | The paper mentions 'https://seungjooshin.github.io/LocoGS', which is a project homepage. However, it does not explicitly state that the source code for the methodology described in this paper is available at this link, nor is it a direct link to a code repository. |
| Open Datasets | Yes | For evaluation, we adopt three representative benchmark novel-view synthesis datasets. Specifically, we employ three real-world datasets: Mip-NeRF 360 (Barron et al., 2022), Tanks and Temples (Knapitsch et al., 2017), and Deep Blending (Hedman et al., 2018). Following 3DGS (Kerbl et al., 2023), we evaluate our method on nine scenes from Mip-NeRF 360, two from Tanks and Temples, and two from Deep Blending. |
| Dataset Splits | Yes | For evaluation, we adopt three representative benchmark novel-view synthesis datasets. Specifically, we employ three real-world datasets: Mip-NeRF 360 (Barron et al., 2022), Tanks and Temples (Knapitsch et al., 2017), and Deep Blending (Hedman et al., 2018). Following 3DGS (Kerbl et al., 2023), we evaluate our method on nine scenes from Mip-NeRF 360, two from Tanks and Temples, and two from Deep Blending. |
| Hardware Specification | Yes | Following the optimization scheme of 3DGS (Kerbl et al., 2023), the optimization process of LocoGS adopts 30K iterations to construct a 3D representation for each scene, which takes about one hour for a single scene on an NVIDIA RTX 6000 Ada GPU. Rendering Speed Measurement: For a fair comparison of rendering speed, we reproduce all the baselines following the outlined configuration and measure the rendering speed in the same experimental setting. We render every test image 100 times on a single GPU (NVIDIA RTX 3090) and report the average rendering time. |
| Software Dependencies | No | The paper mentions several frameworks and tools used (e.g., 3DGS, COLMAP, Nerfacto, Adam optimizer), but does not provide specific version numbers for software libraries or dependencies like Python, PyTorch, or CUDA versions. |
| Experiment Setup | Yes | Specifically, we optimize a loss L defined as: L = (1 − λ)·L_1 + λ·L_SSIM + λ_mask·L_mask + λ_SH-mask·L_SH-mask (Eq. 4), where L_1 and L_SSIM represent an L1 loss and an SSIM loss... λ, λ_mask, and λ_SH-mask are balancing weights. We optimize Nerfacto (Tancik et al., 2023) for 30K iterations... We set the rendering loss weight λ = 0.2, the mask threshold to τ = 0.01, and the masking loss weight to λ_mask = 0.005 for our small variant and λ_mask = 0.004 for our base model. We set the SH mask threshold to τ_SH = 0.01, and the SH masking loss weight to λ_SH-mask = 0.0001. We apply 6-bit uniform quantization for hash grid parameters θ and base scales γ, and apply 8-bit uniform quantization for base colors k_0. Prior to k-bit quantization, we clip the target values to lie within 3 + 3(k − 1)/15 standard deviations of their mean, as proposed by Dupont et al. (2022). |
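Algorithms 3 and 4 in the table above (masking and pruning Gaussians during optimization) are available only as pseudocode in the paper's appendix. A minimal sketch of the masking idea, assuming a Compact-3DGS-style learnable per-Gaussian mask with a straight-through estimator; the function name `binary_mask` and the exact thresholding are illustrative, not the authors' code:

```python
import torch

def binary_mask(logits: torch.Tensor, tau: float = 0.01) -> torch.Tensor:
    """Hard 0/1 mask over Gaussians from learnable logits.

    tau matches the mask threshold reported in the paper (0.01); the
    straight-through trick keeps gradients flowing to the soft mask
    even though the forward pass uses the hard, binarized values.
    """
    soft = torch.sigmoid(logits)          # soft mask in (0, 1)
    hard = (soft > tau).float()           # hard mask used at render time
    # Straight-through estimator: forward value is `hard`,
    # backward gradient is that of `soft`.
    return hard - soft.detach() + soft
```

Gaussians whose mask reaches zero would then be pruned outright (Algorithm 4), shrinking both storage and per-frame rasterization cost.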
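The rendering-speed protocol quoted above (every test image rendered 100 times, average time reported) can be sketched as follows. `render_fn` and `views` are placeholders for a real renderer and test set, and timing on a GPU would additionally require device synchronization (e.g. `torch.cuda.synchronize()`) before and after each timed span:

```python
import time

def mean_render_time(render_fn, views, repeats=100):
    """Average per-image render time over all test views.

    Each view is rendered `repeats` times (100 in the paper's setup)
    and the grand mean across all renders is returned, in seconds.
    """
    total, count = 0.0, 0
    for view in views:
        start = time.perf_counter()
        for _ in range(repeats):
            render_fn(view)
        total += time.perf_counter() - start
        count += repeats
    return total / count
```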
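The k-bit uniform quantization with pre-clipping described in the experiment setup (clip to within 3 + 3(k − 1)/15 standard deviations of the mean, per Dupont et al., 2022) can be sketched in NumPy. The helper name and the explicit dequantization step are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def quantize_kbit(values: np.ndarray, k: int) -> np.ndarray:
    """k-bit uniform quantization with mean/std clipping.

    Values are first clipped to mean ± (3 + 3*(k - 1)/15) standard
    deviations, then uniformly quantized to 2**k levels over that
    range, and finally dequantized back for use in rendering.
    """
    n_std = 3 + 3 * (k - 1) / 15
    mean, std = values.mean(), values.std()
    lo, hi = mean - n_std * std, mean + n_std * std
    clipped = np.clip(values, lo, hi)
    levels = 2 ** k - 1
    # Map [lo, hi] onto integer codes {0, ..., 2**k - 1}.
    codes = np.round((clipped - lo) / (hi - lo) * levels)
    # Dequantize: recover approximate values from the integer codes.
    return lo + codes / levels * (hi - lo)
```

With k = 6 this matches the bit width used for hash grid parameters and base scales, and k = 8 the width used for base colors.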