Fast Feedforward 3D Gaussian Splatting Compression

Authors: Yihang Chen, Qianyi Wu, Mengyao Li, Weiyao Lin, Mehrtash Harandi, Jianfei Cai

ICLR 2025 | Venue PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments across various datasets demonstrate the effectiveness of FCGS, achieving a compression ratio of 20 while maintaining excellent fidelity, even surpassing most of the optimization-based methods.
Researcher Affiliation | Academia | Yihang Chen1,2, Qianyi Wu2, Mengyao Li3,2, Weiyao Lin1, Mehrtash Harandi2, Jianfei Cai2; 1Shanghai Jiao Tong University, 2Monash University, 3Shanghai University
Pseudocode | No | The paper describes the method using textual descriptions and architectural diagrams (Figure 2, Figure 3), but no explicitly labeled 'Pseudocode' or 'Algorithm' block is present.
Open Source Code | Yes | Code: github.com/YihangChen-ee/FCGS.
Open Datasets | Yes | To achieve that, we refer to DL3DV dataset (Ling et al., 2024), which contains approximately 7K multi-view scenes. ... For 3DGS from optimization, we employ DL3DV-GS, Mip-NeRF360 (Barron et al., 2022), and Tanks&Temples (Knapitsch et al., 2017) for evaluation. ... We utilize 10 scenes from ACID (Liu et al., 2021) and 50 scenes from Gobjaverse (Qiu et al., 2023; Deitke et al., 2022) for these two models for evaluation.
Dataset Splits | Yes | After filtering out low-quality ones, we obtain 6770 3DGS, and randomly split 100 for testing and the remaining for training. This dataset is referred to as DL3DV-GS.
Hardware Specification | Yes | Our FCGS model is implemented using the PyTorch framework (Paszke et al., 2019) and trained on a single NVIDIA L40s GPU.
Software Dependencies | No | Our FCGS model is implemented using the PyTorch framework (Paszke et al., 2019) and trained on a single NVIDIA L40s GPU. No specific version number for PyTorch or other software dependencies is provided.
Experiment Setup | Yes | The dimension of ŷ is set to 256 for color (m = 1). For ẑ, dimensions are set to 16, 24, and 64 for geometry, color (m = 0), and color (m = 1), respectively. Grid resolutions are {70, 80, 90} for 3D grids and {300, 400, 500} for 2D grids. We set N_s to 4, using uneven splitting ratios of { 1 3}, with uniform random sampling. m is set to 0.01. In inference, we maintain the same random seed in encoding and decoding to guarantee consistency. The training batch size is 1 (i.e., one 3DGS scene per training step). We adjust λ from 1e-4 to 16e-4 to achieve variable bitrates.
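The seed-consistency detail in the Experiment Setup row (splitting Gaussians into N_s = 4 groups by uniform random sampling, with the same random seed on both sides) can be illustrated with a minimal sketch. This is a hypothetical illustration, not the paper's implementation: the function name `split_into_groups` is invented, and `np.array_split` produces near-even group sizes, whereas the paper uses uneven splitting ratios.

```python
import numpy as np

def split_into_groups(num_gaussians: int, num_groups: int, seed: int):
    """Partition Gaussian indices into groups via uniform random sampling.

    Fixing the seed makes the random partition deterministic, so an
    encoder and a decoder that run this independently recover identical
    groupings and can process the groups in a consistent order.
    """
    rng = np.random.default_rng(seed)
    perm = rng.permutation(num_gaussians)  # uniform random shuffle of indices
    return np.array_split(perm, num_groups)

# Encoder and decoder use the same seed, so their groups match exactly.
enc_groups = split_into_groups(1000, 4, seed=42)
dec_groups = split_into_groups(1000, 4, seed=42)
assert all((a == b).all() for a, b in zip(enc_groups, dec_groups))
```

Without the shared seed, the two sides would assign Gaussians to different groups and the entropy-coded bitstream could not be decoded consistently.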