Bringing NeRFs to the Latent Space: Inverse Graphics Autoencoder
Authors: Antoine Schnepf, Karim Kassab, Jean-Yves Franceschi, Laurent Caraffa, Flavian Vasile, Jeremie Mary, Andrew Comport, Valerie Gouet-Brunet
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We experimentally confirm that Latent NeRFs trained with IG-AE present an improved quality compared to a standard autoencoder, all while exhibiting training and rendering accelerations with respect to NeRFs trained in the image space. Our project page can be found at https://ig-ae.github.io . |
| Researcher Affiliation | Collaboration | Antoine Schnepf* 1,2, Karim Kassab* 1,3, Jean-Yves Franceschi 1, Laurent Caraffa 3, Flavian Vasile 1, Jeremie Mary 1, Andrew Comport 2, Valérie Gouet-Brunet 3. * Equal contribution. 1 Criteo AI Lab, Paris, France; 2 Université Côte d'Azur, CNRS, I3S, France; 3 LASTIG, Université Gustave Eiffel, IGN-ENSG, F-94160 Saint-Mandé |
| Pseudocode | No | The paper describes the methodology and training process using mathematical equations and prose, but does not include any clearly labeled pseudocode or algorithm blocks. For example, Section 3 describes "LATENT NERF" and its training process. |
| Open Source Code | Yes | We utilize the trained IG-AE to bring NeRFs to the latent space with a latent NeRF training pipeline, which we implement in an open-source extension of the Nerfstudio framework, thereby unlocking latent scene learning for its supported methods. [...] Our code is open-source and available on the following GitHub repository: https://github.com/AntoineSchnepf/latent-nerfstudio . The training code for IG-AE is open-source and available on the following GitHub repository: https://github.com/k-kassab/igae . |
| Open Datasets | Yes | For 3D-regularization, we adopt Objaverse (Deitke et al., 2023), a synthetic dataset which is standard when large-scale and diverse 3D data is needed (Liu et al., 2023; Shi et al., 2024). [...] For AE preservation, we adopt ImageNet (Deng et al., 2009), a large dataset of diverse real images. [...] For NeRF evaluations, we utilize synthetic, object-level data as it aligns with the training domain. As such, we train NeRFs on held-out scenes from Objaverse, and on scenes from three out-of-distribution datasets: ShapeNet Hats, Bags, and Vases (Chang et al., 2015). |
| Dataset Splits | No | For 3D-regularization, we adopt Objaverse (Deitke et al., 2023), a synthetic dataset which is standard when large-scale and diverse 3D data is needed (Liu et al., 2023; Shi et al., 2024). We utilize N = 500 objects from Objaverse. Each object is rendered from V = 300 views at a 128×128 resolution. [...] For NeRF evaluations, we utilize synthetic, object-level data as it aligns with the training domain. As such, we train NeRFs on held-out scenes from Objaverse, and on scenes from three out-of-distribution datasets: ShapeNet Hats, Bags, and Vases (Chang et al., 2015). [...] Table 1: Main Results on ShapeNet datasets. All results are obtained by training NeRFs with our Latent NeRF Training Pipeline, and are averaged over 4 scenes from each dataset. |
| Hardware Specification | Yes | Training IG-AE takes 60 hours on 4 NVIDIA L4 GPUs. [...] Training and rendering time is measured using a single NVIDIA L4 GPU. |
| Software Dependencies | No | Nerfstudio (Tancik et al., 2023) emerged as a unified PyTorch (Paszke et al., 2019) framework in which NeRF models are implemented using standardized implementations, making it straightforward for researchers and practitioners to integrate various NeRF models into their projects. [...] We adopt the pre-trained Ostris KL-f8-d16 VAE (Burkett, 2024) from Hugging Face, which has a downscale factor l = 8, and c = 16 feature channels in the latent space. |
| Experiment Setup | Yes | To train a latent NeRF in Nerfstudio, we first train the chosen model for 10 000 iterations to minimize L_LS using the method-specific optimization process. Subsequently, we continue the training with 15 000 iterations of RGB alignment by minimizing L_align. To account for the change of image representations, we modulate the learning rate of each method by a factor of ξ_LS in latent supervision, and a factor ξ_align for RGB alignment. Appendix F.2 details the hyper-parameters we used in Nerfstudio, including the values of these factors for each method. [...] Appendix F.1: IG-AE TRAINING SETTINGS: Table 15 details the hyperparameters taken to train our IG-AE. |
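The quoted setup describes a two-stage schedule: 10 000 iterations of latent supervision with the learning rate scaled by ξ_LS, followed by 15 000 iterations of RGB alignment scaled by ξ_align. A minimal sketch of that schedule is below; the function name and the default ξ values are illustrative assumptions (the paper's actual per-method values are in its Appendix F.2):

```python
def latent_nerf_lr(base_lr: float, iteration: int,
                   xi_ls: float = 1.0, xi_align: float = 0.1,
                   ls_iters: int = 10_000, align_iters: int = 15_000) -> float:
    """Hypothetical two-stage learning-rate modulation for latent NeRF training.

    Stage 1 (latent supervision, minimizing L_LS): base_lr scaled by xi_ls.
    Stage 2 (RGB alignment, minimizing L_align):   base_lr scaled by xi_align.
    """
    if iteration < ls_iters:
        return base_lr * xi_ls        # latent-supervision stage
    if iteration < ls_iters + align_iters:
        return base_lr * xi_align     # RGB-alignment stage
    raise ValueError(f"iteration {iteration} exceeds the 25 000-step schedule")
```

In a Nerfstudio-style trainer this would typically be applied by updating each optimizer's parameter-group learning rate at every step, e.g. `group["lr"] = latent_nerf_lr(base_lr, step)`.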