High-Fidelity Polarimetric Implicit 3D Reconstruction with View-Dependent Physical Representation

Authors: Yu Qiu, Sijia Wen, Hainan Zhang, Zhiming Zheng

AAAI 2025

Reproducibility Variable Result LLM Response
Research Type Experimental Experimental results demonstrate the superior performance of the proposed method in both real and synthetic scenarios. Experiments. Dataset: The experimental datasets include both a collection of objects with complex characteristics from real-world scenes and a publicly accessible synthetic dataset. For each object, we captured between 40 and 60 polarized images using a polarized color camera, FLIR BFS-U3-51S5PC-C. Comparisons: We conducted experiments on both synthetic and real datasets to validate the effectiveness of our method. More comparison results on synthetic and real data can be found in the supplementary material. Quantitative and qualitative comparisons with competing models are respectively showcased in the referenced Tab. 1 and Tab. 2. We also show the visualization results in Figure 5. Ablation Study: In this section, we designed ablation experiments to demonstrate the effectiveness of our polarized cues. Taking the hedgehog object as an example, the results of our ablation experiments are shown in Tab. 3.
Researcher Affiliation Academia Yu Qiu, Sijia Wen*, Hainan Zhang, Zhiming Zheng Beijing Advanced Innovation Center for Future Blockchain and Privacy Computing, School of Artificial Intelligence, Beihang University, Beijing, China EMAIL, EMAIL
Pseudocode No The paper describes methods and algorithms in text and provides an overview architecture diagram in Figure 3, but it does not contain any structured pseudocode or algorithm blocks.
Open Source Code No Our implementation was carried out in PyTorch (Paszke et al. 2019), excluding any customized CUDA kernels. ... Further implementation details are in the supplement. The paper mentions implementation details in the supplement but does not explicitly state that the source code for the described methodology is publicly available, nor does it provide a link to a code repository.
Open Datasets Yes We utilize SMVP3D Dataset as the synthetic dataset, which includes reflective objects and was published by NeRSP (Han et al. 2024).
Dataset Splits No For each object, we captured between 40 and 60 polarized images using a polarized color camera, FLIR BFS-U3-51S5PC-C. The dataset comprises original images, Stokes vectors, AoLP maps, DoLP maps, masks, and camera pose data. ... We utilize SMVP3D Dataset as the synthetic dataset... The paper describes the acquisition of its own dataset and mentions using a synthetic dataset, but it does not specify any explicit training, validation, or test splits for either dataset.
Hardware Specification Yes The model was optimized for 3000 epochs, with a batch size of 2048 pixel rays. The learning rate we set was 1e-4, and all methods were executed on a single NVIDIA A100 GPU.
Software Dependencies No Our implementation was carried out in PyTorch (Paszke et al. 2019), excluding any customized CUDA kernels. The paper mentions PyTorch but does not provide a specific version number for it or any other software dependencies.
Experiment Setup Yes Our implementation was carried out in PyTorch (Paszke et al. 2019), excluding any customized CUDA kernels. The model was optimized for 3000 epochs, with a batch size of 2048 pixel rays. The learning rate we set was 1e-4, and all methods were executed on a single NVIDIA A100 GPU. Our training is divided into three stages. In the first stage, the algorithm constrains the f_sdf using RGB and silhouette to obtain the initial shape, which requires 1500 epochs. In the second stage, we freeze the parameters of the f_sdf and train three incident Stokes networks using real Stokes information captured by the camera, which requires 1000 epochs. In the third stage, we unfreeze the f_sdf and perform joint optimization on the f_sdf and three incident Stokes MLPs, which requires 500 epochs. λ_c is set to 1 by default, λ_s is set to 1 in the second and third stages, and λ_aolp, like PIR, is set to 0.02 by default.
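
The three-stage schedule quoted in the Experiment Setup row can be written down as a small lookup table. The following is a minimal sketch, not the authors' code: the epoch counts, learning rate, batch size, and loss weights come from the quoted text, while the names (`Stage`, `STAGES`, `stage_for_epoch`) are hypothetical. Two values are assumptions because the quoted text does not state them: that the Stokes loss weight is 0 in the first stage, and that the incident Stokes networks are not updated in the first stage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str            # hypothetical label, not from the paper
    epochs: int          # number of epochs spent in this stage
    train_sdf: bool      # whether the SDF network (f_sdf) is updated
    train_stokes: bool   # whether the three incident Stokes networks are updated
    lambda_s: float      # weight of the Stokes loss term


# Values quoted in the setup row; stage names are illustrative only.
LEARNING_RATE = 1e-4
RAYS_PER_BATCH = 2048
LAMBDA_C = 1.0
LAMBDA_AOLP = 0.02

STAGES = (
    # ASSUMPTION: lambda_s = 0 and train_stokes = False in stage 1.
    # The source only says this stage uses RGB and silhouette.
    Stage("shape_init", 1500, train_sdf=True, train_stokes=False, lambda_s=0.0),
    # f_sdf is frozen; only the incident Stokes networks are trained.
    Stage("stokes_only", 1000, train_sdf=False, train_stokes=True, lambda_s=1.0),
    # f_sdf is unfrozen and optimized jointly with the Stokes networks.
    Stage("joint", 500, train_sdf=True, train_stokes=True, lambda_s=1.0),
)

TOTAL_EPOCHS = sum(stage.epochs for stage in STAGES)


def stage_for_epoch(epoch: int) -> Stage:
    """Return the stage that is active at a 0-based epoch index."""
    if epoch < 0:
        raise ValueError("epoch must be non-negative")
    start = 0
    for stage in STAGES:
        if epoch < start + stage.epochs:
            return stage
        start += stage.epochs
    raise ValueError("epoch lies beyond the end of the schedule")
```

In a training loop, `stage_for_epoch(epoch)` could decide which parameter groups are passed to the optimizer and which loss terms are active. The stage lengths sum to the 3000 epochs reported in the Hardware Specification row.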