Spatial Annealing for Efficient Few-shot Neural Rendering
Authors: Yuru Xiao, Deming Zhai, Wenbo Zhao, Kui Jiang, Junjun Jiang, Xianming Liu
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our extensive experiments reveal that, by adding merely one line of code, SANeRF delivers superior rendering quality and much faster reconstruction speed compared to current few-shot neural rendering methods. Extensive experiments on synthetic datasets and Blender (Mildenhall et al. 2021) in a few-shot context demonstrate the effectiveness and efficiency of our approach. We present extensive experimental results demonstrating that our method achieves superior efficiency and enhanced performance. Datasets and Metrics. We evaluate our method using the Blender dataset (Mildenhall et al. 2021)... For quantitative analysis, we report the average scores across all test scenes for PSNR, SSIM, and LPIPS. Ablation Studies: To thoroughly assess the efficacy of our proposed method, we conduct qualitative and quantitative ablation studies... |
| Researcher Affiliation | Academia | Yuru Xiao, Deming Zhai*, Wenbo Zhao, Kui Jiang, Junjun Jiang, Xianming Liu, Harbin Institute of Technology, {zhaideming, wbzhao, jiangkui, jiangjunjun, csxm}@hit.edu.cn |
| Pseudocode | No | The paper describes the methodology using mathematical formulations and descriptive text, but it does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper states: "We implement our methodology based on the Tri-MipRF codebase (Hu et al. 2023), utilizing the PyTorch framework (Paszke et al. 2019)." However, there is no explicit statement about releasing the code for the method described in this paper, nor is there a link to a code repository. |
| Open Datasets | Yes | Datasets and Metrics. We evaluate our method using the Blender dataset (Mildenhall et al. 2021), which comprises 8 synthetic scenes observed from a surround view perspective. Fig. 6 and Tab. 2 present qualitative and quantitative comparisons of our method with FreeNeRF (Yang, Pavone, and Wang 2023). Quantitatively, our method matches or slightly outperforms FreeNeRF, showing up to a 0.05 dB and 0.1 dB increase in PSNR on the LLFF dataset (Mildenhall et al. 2019) and the DTU dataset (Jensen et al. 2014), respectively. |
| Dataset Splits | Yes | Consistent with FreeNeRF (Yang, Pavone, and Wang 2023) and DietNeRF (Jain, Tancik, and Abbeel 2021), we train our model on 8 views identified by the IDs 26, 86, 2, 55, 75, 93, 16, and 73. The evaluation is conducted on 25 images, evenly selected from the test set. We use the first n images from the Blender training set as input, where n is varied among 8, 20, 40, 60, 80, and 100, with the dataset containing a total of 100 images. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or memory specifications used for running the experiments. It only mentions the software framework used for implementation. |
| Software Dependencies | No | The paper mentions the "PyTorch framework (Paszke et al. 2019)" and the "JAX MipNeRF framework (Barron et al. 2021)" as bases for their implementation. However, it does not specify concrete version numbers for PyTorch, JAX, or any other critical software dependencies. |
| Experiment Setup | Yes | Specifically, we initialize the sphere size fs at 0.15, set the decrease rate ϑ to 0.2, define the total number of decrement steps Nsplit as 30, and establish the stop point T at 2K. In the few-shot setting, there is a significant reduction in the number of input rays. Consequently, we limit the maximum training steps for both our method and the baseline Tri-MipRF to one-tenth of those employed in Tri-MipRF's full-view setting. We train our model using the AdamW optimizer (Loshchilov and Hutter 2017) for 2.5K iterations, with a learning rate of 2×10⁻³ and a weight decay of 1×10⁻⁵. Regarding the degree truncation in Spherical Harmonic Encoding, we consistently set the truncated level n to 2 across all experiments. |
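The Experiment Setup row quotes the annealing hyperparameters (initial sphere size fs = 0.15, decrease rate ϑ = 0.2, Nsplit = 30 decrement steps, stop point T = 2K) but not the decay law itself. A minimal sketch of one plausible reading, assuming a stepwise multiplicative decay applied over Nsplit discrete stages up to step T — the function name `sphere_size` and the decay formula are illustrative assumptions, not the paper's verified schedule:

```python
def sphere_size(step, fs_init=0.15, decay=0.2, n_split=30, stop=2000):
    """Assumed spatial-annealing schedule: the sampling-sphere size starts
    at fs_init and shrinks multiplicatively by (1 - decay) over n_split
    discrete stages, freezing once training reaches the stop point."""
    step = min(step, stop)                       # clamp at the stop point T
    stage = min(step * n_split // stop, n_split) # which decrement stage we are in
    return fs_init * (1.0 - decay) ** stage
```

Under this reading, the schedule begins at 0.15, halves roughly every three stages (0.8³ ≈ 0.51), and is constant after step 2000, which would match the paper's "one line of code" framing: the annealed size is just a function of the training step.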