Flow Distillation Sampling: Regularizing 3D Gaussians with Pre-trained Matching Priors

Authors: Lin-Zhuo Chen, Kangjie Liu, Youtian Lin, Zhihao Li, Siyu Zhu, Xun Cao, Yao Yao

ICLR 2025

Reproducibility Variable Result LLM Response
Research Type Experimental Comprehensive experiments in depth rendering, mesh reconstruction, and novel view synthesis showcase the significant advantages of FDS over state-of-the-art methods. Additionally, our interpretive experiments and analysis aim to shed light on the effects of FDS on geometric accuracy and rendering quality, potentially providing readers with insights into its performance. Project page: https://nju-3dv.github.io/projects/fds. 4 EXPERIMENTS 4.1.1 IMPLEMENTATION DETAILS We apply our FDS method to two types of 3DGS: the original 3DGS and 2DGS (Huang et al., 2024). The number of iterations in our optimization process is 35,000. We follow the default training configuration and apply our FDS method after 15,000 iterations; we then add a normal consistency loss for both 3DGS and 2DGS after 25,000 iterations. The weight for FDS, λfds, is set to 0.015, σ is set to 23, and the weight for normal consistency is set to 0.15 for all experiments. We removed the depth distortion loss in 2DGS because we found that it degrades results in indoor scenes. The Gaussian point cloud is initialized using COLMAP for all datasets. We tested the impact of using SEA-RAFT (Wang et al., 2024) and RAFT (Teed and Deng, 2020) on FDS performance. Due to the blurriness of the ScanNet dataset, additional prior constraints are required; thus, we incorporate normal prior supervision on the rendered normals on the ScanNet (V2) dataset by default. The normal prior is predicted by the StableNormal model (Ye et al., 2024) across all types of 3DGS. The entire framework is implemented in PyTorch (Paszke et al., 2019), and all experiments are conducted on a single NVIDIA 4090D GPU.
Researcher Affiliation Collaboration Lin-Zhuo Chen (Nanjing University), Kangjie Liu (Nanjing University), Youtian Lin (Nanjing University), Zhihao Li (Huawei Noah's Ark Lab), Siyu Zhu (Fudan University), Xun Cao (Nanjing University), Yao Yao (Nanjing University)
Pseudocode Yes Algorithm 1 Flow Distillation Sampling
Input: a batch of training images {I_i}_{i=1}^N, transformation matrices {T_i}_{i=1}^N, prior matching network M_θ, Gaussian points {P_i}_{i=1}^M with {r_i, t_i, f_i, µ_i, α_i}.
Output: L_fds
1: for i in {1, 2, . . . , B} do
2:   ξ ~ U(0, 1), R ← I
3:   t1 ← σ (D̄_i / f) sin(2πξ), t2 ← σ (D̄_i / f) cos(2πξ), t3 ← 0
4:   E ← [R, t], T^s ← ((T^i)^{-1} E)^{-1}
5:   C_s, D_s ← Render(T^s, P)
6:   C_i, D_i ← Render(T^i, P)
7:   X_s ← K T^s_i K^{-1} D_i(X_i) X_i / D_s(X_s)
8:   F_{i→s} ← X_s − X_i
9:   F̂_{i→s} ← M_θ(I_i, C_s)
10:  L_fds ← L_fds + (1/B) ||F_{i→s} − F̂_{i→s}||²
11: return L_fds
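The core of Algorithm 1 (lines 7–10) is comparing the flow induced by the rendered depth against the flow predicted by the matching network. The following is a minimal PyTorch sketch of that comparison only: the renderer, the matching network `M_θ`, and the pose-sampling details are external, so `sample_neighbor_pose` and `fds_flow_loss` are illustrative names, and the `sigma * mean_depth / focal` radius is our reading of line 3, not code from the released implementation.

```python
import math
import torch

def sample_neighbor_pose(T_i, sigma, mean_depth, focal):
    # Sample a nearby camera on a circle around view i (Alg. 1, lines 2-4).
    # Rotation is left as identity (R = I); only translation is perturbed.
    xi = torch.rand(1).item()
    r = sigma * mean_depth / focal  # assumed pixel-to-world radius scaling
    E = torch.eye(4)
    E[0, 3] = r * math.sin(2 * math.pi * xi)
    E[1, 3] = r * math.cos(2 * math.pi * xi)
    # T^s = ((T^i)^-1 E)^-1
    return torch.linalg.inv(torch.linalg.inv(T_i) @ E)

def fds_flow_loss(D_i, K, T_rel, prior_flow):
    """Depth-induced flow vs. prior flow (Alg. 1, lines 7-10).

    D_i: (H, W) rendered depth of view i
    K: (3, 3) camera intrinsics
    T_rel: (4, 4) relative pose mapping view-i camera coords to view s
    prior_flow: (H, W, 2) flow predicted by the matching network
    """
    H, W = D_i.shape
    ys, xs = torch.meshgrid(torch.arange(H), torch.arange(W), indexing="ij")
    pix = torch.stack([xs, ys, torch.ones_like(xs)], -1).float()  # (H, W, 3)
    # Back-project: X = D_i * K^-1 * pix, then transform and re-project.
    cam = (torch.linalg.inv(K) @ pix.reshape(-1, 3).T) * D_i.reshape(1, -1)
    cam_h = torch.cat([cam, torch.ones(1, cam.shape[1])], 0)  # homogeneous
    proj = K @ (T_rel @ cam_h)[:3]
    pix_s = (proj[:2] / proj[2:].clamp(min=1e-6)).T.reshape(H, W, 2)
    flow = pix_s - pix[..., :2]           # induced flow F_{i->s}
    return torch.mean((flow - prior_flow) ** 2)
```

With an identity relative pose the induced flow is zero everywhere, so the loss reduces to the mean squared prior flow; this is a convenient sanity check when wiring the function into a training loop.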
Open Source Code No Project page: https://nju-3dv.github.io/projects/fds.
Open Datasets Yes The proposed FDS has been extensively evaluated on MuSHRoom (Ren et al., 2024), ScanNet (V2) (Dai et al., 2017), and Replica (Straub et al., 2019) datasets for the task of geometry reconstruction.
Dataset Splits Yes We train our model on the training split of the long capture sequence and evaluate novel view synthesis on the test split of the long capture sequences. Five scenes are selected to evaluate our FDS, including "coffee room", "honka", "kokko", "sauna", and "vr room". ScanNet (V2) (Dai et al., 2017) consists of 1,613 indoor scenes with annotated camera poses and depth maps. We select 5 scenes from the ScanNet (V2) dataset, uniformly sampling one-tenth of the views, following the approach in (Guo et al., 2022). Replica (Straub et al., 2019) contains small-scale real-world indoor scans; we evaluate our FDS on five scenes from Replica: office0, office1, office2, office3, and office4, selecting one-tenth of the views for training.
Hardware Specification Yes The entire framework is implemented in PyTorch (Paszke et al., 2019), and all experiments are conducted on a single NVIDIA 4090D GPU.
Software Dependencies No The entire framework is implemented in PyTorch (Paszke et al., 2019)
Experiment Setup Yes The number of iterations in our optimization process is 35,000. We follow the default training configuration and apply our FDS method after 15,000 iterations; we then add a normal consistency loss for both 3DGS and 2DGS after 25,000 iterations. The weight for FDS, λfds, is set to 0.015, σ is set to 23, and the weight for normal consistency is set to 0.15 for all experiments. We removed the depth distortion loss in 2DGS because we found that it degrades results in indoor scenes. The Gaussian point cloud is initialized using COLMAP for all datasets.
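The reported schedule (35k total iterations, FDS from 15k, normal consistency from 25k) can be captured in a small config sketch. The hyperparameter values below come from the excerpt; the dictionary keys and the `active_losses` helper are hypothetical names for illustration, not from the authors' code.

```python
# Hyperparameter values as reported; key names are illustrative.
FDS_CONFIG = {
    "total_iterations": 35_000,
    "fds_start_iter": 15_000,            # FDS applied after 15,000 iterations
    "normal_consistency_start_iter": 25_000,
    "lambda_fds": 0.015,                 # weight for the FDS loss
    "sigma": 23,                         # radius scale for sampled poses
    "lambda_normal_consistency": 0.15,
    "use_depth_distortion_loss": False,  # removed in 2DGS (hurt indoor scenes)
    "init": "colmap",                    # Gaussian point cloud initialization
}

def active_losses(iteration, cfg=FDS_CONFIG):
    # Return which auxiliary losses are active at a given iteration,
    # mirroring the staged training schedule described above.
    losses = ["photometric"]
    if iteration >= cfg["fds_start_iter"]:
        losses.append("fds")
    if iteration >= cfg["normal_consistency_start_iter"]:
        losses.append("normal_consistency")
    return losses
```

Staging the losses this way matches the common 3DGS practice of letting the photometric objective establish coarse geometry before geometric regularizers are switched on.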