RefinedFields: Radiance Fields Refinement for Planar Scene Representations

Authors: Karim Kassab, Antoine Schnepf, Jean-Yves Franceschi, Laurent Caraffa, Jeremie Mary, Valerie Gouet-Brunet

TMLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We carry out extensive experiments and verify the merit of our method on synthetic data and real tourism photo collections. RefinedFields enhances rendered scenes with richer details and improves upon its base representation on the task of novel view synthesis. We conduct extensive quantitative and qualitative evaluations of RefinedFields. We show that our method improves upon K-Planes with richer details in scene renderings. We prove via ablation studies that this added value indeed comes from the fine-tuned prior of the pre-trained network. Figures 1 and 4 show qualitative comparisons of RefinedFields with K-Planes, showing the visual improvements brought by our refining pipeline, which brings finer details to monuments in the Phototourism scenes. We start by assessing RefinedFields via an experiment on a case study. We then evaluate RefinedFields on synthetic scenes (Mildenhall et al., 2020) and real-world Phototourism (Jin et al., 2020) scenes, where we showcase the improvements our method exhibits relative to our K-Planes base representation. Quantitative results can be found in Tables 2 and 3, where we report for each experiment the Peak Signal-to-Noise Ratio (PSNR) for pixel-level similarity, the Structural Similarity Index Measure (SSIM) for structural-level similarity, and the Learned Perceptual Image Patch Similarity (Zhang et al., 2018, LPIPS) for perceptual similarity.
Researcher Affiliation | Collaboration | Karim Kassab (EMAIL): Criteo AI Lab, Paris, France; LASTIG, Université Gustave Eiffel, IGN-ENSG, F-94160 Saint-Mandé. Antoine Schnepf (EMAIL): Criteo AI Lab, Paris, France; Université Côte d'Azur, CNRS, I3S, France. Jean-Yves Franceschi (EMAIL): Criteo AI Lab, Paris, France. Laurent Caraffa (EMAIL): LASTIG, Université Gustave Eiffel, IGN-ENSG, F-94160 Saint-Mandé. Jeremie Mary (EMAIL): Criteo AI Lab, Paris, France. Valerie Gouet-Brunet (EMAIL): LASTIG, Université Gustave Eiffel, IGN-ENSG, F-94160 Saint-Mandé.
Pseudocode | Yes | Algorithm 1: Alternating training algorithm.
Open Source Code | Yes | Our Python source code (tested on version 3.7.16), based on PyTorch (Paszke et al., 2019) (tested on version 1.13.1) and CUDA (tested on version 11.6), is publicly available as open source.
Open Datasets | Yes | We evaluate our method similarly to prior work (Mildenhall et al., 2020; Martin-Brualla et al., 2021; Fridovich-Keil et al., 2023) in novel view synthesis, by adopting the Real Synthetic 360 dataset (Mildenhall et al., 2020) for synthetic scenes and the same three scenes of cultural monuments from the Phototourism dataset (Jin et al., 2020) for real-world scenes: Brandenburg Gate, Sacré Coeur, and Trevi Fountain.
Dataset Splits | Yes | Consistently with prior work, 100 images are used for training each scene and 200 images are used for testing. All images are at 800 × 800 pixels. ... Testing is done on a standard set that is free of transient occluders.
Hardware Specification | Yes | We run all experiments on a single NVIDIA A100 GPU.
Software Dependencies | Yes | Our Python source code (tested on version 3.7.16), based on PyTorch (Paszke et al., 2019) (tested on version 1.13.1) and CUDA (tested on version 11.6), is publicly available as open source. We also utilize Diffusers (von Platen et al., 2022) and Stable Diffusion (tested on version 1-5, main revision). K-Planes also adopt the tinycudann framework (Müller, 2021).
Experiment Setup | Yes | A summary of our hyperparameters for synthetic as well as in-the-wild scenes can be found in Table 4. ... Hyperparameter: Value. Epochs (N_epochs): 200 (synthetic), 20 (Sacré Coeur), 20 (Brandenburg Gate), 10 (Trevi Fountain); Fitting iterations (N1): 30000; Refining iterations (N2): 3000; Batch size: 4096; Optimizer: Adam; Scheduler: Warmup Cosine; K-Planes learning rate: 0.01; LoRA learning rate: 0.0001; LoRA rank (r): 4; SD latent resolution: 64; SD channel dimension: 4; SD prompt: (blank); Number of planes: 3; K-Planes resolution: 512; K-Planes channel dimension: 32; Epochs Appearance Optimization: 10; Appearance embeddings dimension: 32; Appearance learning rate: 0.1 (Sacré Coeur), 0.1 (Trevi Fountain), 0.001 (Brandenburg Gate); Appearance batch size: 512.
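The Research Type entry above names PSNR as the pixel-level similarity metric reported in Tables 2 and 3. As a minimal illustration of what that metric measures, the sketch below computes PSNR from the mean squared error between a reference image and a rendering. It is not the authors' evaluation code: the function name `psnr`, the flat-list image representation, and the default intensity range of [0, 1] are assumptions made for this example.

```python
import math


def psnr(reference, rendered, max_value=1.0):
    """Peak Signal-to-Noise Ratio in dB between two flat pixel sequences.

    Assumption for this sketch: both images are flattened into sequences of
    floats scaled to [0, max_value]. Higher values mean closer images.
    """
    if len(reference) != len(rendered) or len(reference) == 0:
        raise ValueError("images must be non-empty and of the same size")

    # Mean squared error over all pixel values.
    mse = sum((r - p) ** 2 for r, p in zip(reference, rendered)) / len(reference)

    # Identical images have zero error, so PSNR is unbounded.
    if mse == 0:
        return math.inf

    # Standard definition: 10 * log10(MAX^2 / MSE).
    return 10.0 * math.log10(max_value ** 2 / mse)
```

For instance, a rendering whose MSE against the reference is 0.01 on a [0, 1] scale scores 20 dB. SSIM and LPIPS, the other two reported metrics, need windowed statistics and a pre-trained network respectively, so they are not sketched here.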