6DGS: Enhanced Direction-Aware Gaussian Splatting for Volumetric Rendering

Authors: Zhongpai Gao, Benjamin Planche, Meng Zheng, Anwesa Choudhuri, Terrence Chen, Ziyan Wu

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments demonstrate that 6DGS significantly outperforms 3DGS and N-DG, achieving up to a 15.73 dB improvement in PSNR while using 66.5% fewer Gaussian points than 3DGS. The project page is: https://gaozhongpai.github.io/6dgs/.
Researcher Affiliation | Industry | Zhongpai Gao, Benjamin Planche, Meng Zheng, Anwesa Choudhuri, Terrence Chen, Ziyan Wu; United Imaging Intelligence, Boston, MA, USA ({first.last}@uii-ai.com)
Pseudocode | Yes | Algorithm 1 outlines the implementation details for converting (i.e., slicing) our 6DGS representation into a 3DGS-compatible format in a single function. Once this slicing operation is performed, the subsequent implementation remains identical to that of 3DGS. Listing 1: Python code for slicing 6DGS to conditional 3DGS
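The slicing step quoted above amounts to conditioning a 6D Gaussian on the view direction, the standard conditional-Gaussian formula. A minimal NumPy sketch of that computation (variable names are illustrative assumptions, not the paper's Listing 1):

```python
import numpy as np

def slice_6d_gaussian(mu_p, mu_d, cov, d):
    """Condition a 6D Gaussian on a view direction d.

    mu_p : (3,) position mean
    mu_d : (3,) direction mean
    cov  : (6, 6) full covariance with blocks [[S_pp, S_pd], [S_dp, S_dd]]
    d    : (3,) query view direction

    Returns the conditional 3D mean, conditional 3x3 covariance, and an
    opacity modulation factor in (0, 1].
    """
    S_pp = cov[:3, :3]
    S_pd = cov[:3, 3:]
    S_dd = cov[3:, 3:]

    S_dd_inv = np.linalg.inv(S_dd)
    gain = S_pd @ S_dd_inv              # 3x3 regression matrix S_pd S_dd^-1

    delta = d - mu_d
    mu_cond = mu_p + gain @ delta       # conditional position mean
    cov_cond = S_pp - gain @ S_pd.T     # S_pp - S_pd S_dd^-1 S_dp

    # Down-weight opacity by how far d lies from the direction mean,
    # measured under the direction marginal N(mu_d, S_dd).
    opacity_factor = np.exp(-0.5 * delta @ S_dd_inv @ delta)
    return mu_cond, cov_cond, opacity_factor
```

After this slice, each Gaussian is an ordinary 3D Gaussian plus a view-dependent opacity weight, so a 3DGS rasterizer can consume it unchanged, consistent with the quoted claim.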
Open Source Code | No | The abstract mentions a project page URL: 'The project page is: https://gaozhongpai.github.io/6dgs/.' This is a high-level project overview page, not a direct link to a code repository. Although Python code snippets are provided in Listing 1, the full source code for the methodology is not explicitly stated to be released or linked.
Open Datasets | Yes | We evaluate 6DGS on two datasets: the public Synthetic NeRF dataset (Mildenhall et al., 2020) and a custom dataset using physically-based rendering (PBR), which we refer to as the 6DGS-PBR dataset. We evaluate 6DGS on three real-world datasets: Deep Blending (Hedman et al., 2021), Tanks & Temples (Knapitsch et al., 2017), and Shiny (Wizadwongsa et al., 2021) to validate its robustness and effectiveness in practical scenarios, as shown in Tables 5 and 6.
Dataset Splits | Yes | For the ct-scan object, we generated 360 views with corresponding camera poses and randomly selected 324 images for training and 36 for testing. For each of the other objects, we rendered 150 views, randomly selecting 100 images for training and 50 for testing.
Hardware Specification | Yes | Training is performed on a single NVIDIA Tesla V100 GPU with 16 GB of memory, using the Adam optimizer (Kingma & Ba, 2014). At the image sizes (with width equal to height) as listed in Table 2 on an NVIDIA Tesla V100 GPU, the rendering times per view were as follows: ...
Software Dependencies | No | The paper mentions 'FlashGS (Feng et al., 2024), an open-source CUDA Python library' and 'implemented in CUDA'. However, it does not provide specific version numbers for FlashGS, CUDA, Python, or any other libraries or frameworks used in the implementation.
Experiment Setup | Yes | In our experiments, we set λopa = 0.35 and the minimum opacity threshold τ = 0.01. For learnable λopa, we initialize λopa = 0.35 and make it trainable only during iterations 15,000 to 28,000. All other parameters are set to their default values as in 3DGS (Kerbl et al., 2023). We set the learning rate to 1×10⁻² for the 6D covariance parameters and 1×10⁻³ for the direction component µd. The default learning rates from 3DGS are applied to the remaining parameters. For the ct-scan object, we initialize the point cloud using the marching cubes algorithm as in DDGS (Gao et al., 2024). For the other objects and the Synthetic NeRF dataset (Mildenhall et al., 2020), we randomly initialize the point cloud with 100,000 points within a cube encompassing the scene.
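As a quick reference, the hyperparameters quoted in the setup can be collected in a short Python sketch (the names are illustrative assumptions; 3DGS defaults cover everything not listed):

```python
# Hedged sketch of the per-parameter learning rates from the setup.
# Group names are illustrative, not the authors' code.
LEARNING_RATES = {
    "cov_6d": 1e-2,   # 6D covariance parameters
    "mu_d": 1e-3,     # direction component of the mean
}

LAMBDA_OPA_INIT = 0.35   # initial value of lambda_opa
MIN_OPACITY = 0.01       # pruning threshold tau

def lambda_opa_trainable(iteration: int) -> bool:
    """lambda_opa is optimized only during iterations 15,000 to 28,000."""
    return 15_000 <= iteration <= 28_000
```

In a training loop this predicate would gate whether gradients flow to λopa at each step, keeping it fixed at 0.35 outside that window.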