Unsupervised Discovery and Composition of Object Light Fields
Authors: Cameron Omid Smith, Hong-Xing Yu, Sergey Zakharov, Fredo Durand, Joshua B. Tenenbaum, Jiajun Wu, Vincent Sitzmann
TMLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Section 4 (Experiments): "We demonstrate that compositional neural light fields, unconstrained by the sampling requirements of volumetric rendering, outperform prior work on unsupervised learning of object-centric 3D representations while dramatically reducing time and memory complexity. We further demonstrate that object-centric light fields admit scene editing in the form of translation and composition, and allow rendering of scenes with tens of objects at interactive frame rates. Please find further qualitative results, including video, in the supplemental material." ... Table 1: "Quantitative Comparison. Our method outperforms state-of-the-art baselines (Yu et al., 2021) across all reconstruction quality metrics, while being orders of magnitude faster and requiring less memory." |
| Researcher Affiliation | Collaboration | 1 MIT CSAIL, 2 Stanford University, 3 Toyota Research Institute, 4 MIT BCS, 5 CBMM |
| Pseudocode | No | The pseudo-code describing the background-aware slot encoding is the same as in uORF, but is included in the supplemental material for reference. |
| Open Source Code | Yes | cameronosmith.github.io/colf |
| Open Datasets | Yes | CLEVR-567: The first room-scene dataset proposed by Yu et al. (2021) is a 3D extension of the CLEVR (Johnson et al., 2017) dataset. ... ShapeNet chairs (Chang et al., 2015) |
| Dataset Splits | Yes | CLEVR-567: There are 1,000 scenes for training and 500 for testing. ... Room-Chair: There are 1,000 scenes for training and 500 for testing. ... Room-Diverse: There are 5,000 scenes for training and 500 for testing. ... City-Block: There are 500 scenes for training, and we render out one scene for qualitative demonstration. |
| Hardware Specification | No | The paper does not provide specific details about the CPU or GPU models, memory, or cloud resources used for the experiments, only general statements like "capacity of even large GPUs". |
| Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies or libraries used for implementation (e.g., Python, PyTorch, TensorFlow, CUDA). |
| Experiment Setup | Yes | On CLEVR-567, we set the background latent vector to 0, since the model needs no information about the unchanging background. This leads to faster model convergence. ... We supervise the model with the L2 reconstruction loss $\lVert \hat{I}_{\text{query}} - I_{\text{query}} \rVert^2$, where $I_{\text{query}}$ are the ground-truth views. We use a deep-feature-based perceptual loss (Zhang et al., 2018) on both chair datasets to avoid inherent ambiguities in estimating lighting and geometry at occluded views. Lastly, we impose a small penalty $\lVert z \rVert^2$ on each regressed object code $z$ to enforce a Gaussian prior. ... Lastly, we initially render and supervise images at 64×64 resolution to efficiently learn the coarse structure and decomposition of scenes, and subsequently supervise at 128×128 to learn finer object structure. |
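The training objective quoted in the Experiment Setup row combines an L2 reconstruction loss, an optional deep-feature perceptual loss, and a small $\lVert z \rVert^2$ penalty on each object code. A minimal NumPy sketch of how these terms could be combined is shown below; the function names, the `penalty_weight` value, and the `perceptual_fn` hook are illustrative assumptions, not the authors' implementation (in practice the perceptual term would be a learned-feature loss such as LPIPS, Zhang et al. 2018).

```python
import numpy as np

def reconstruction_loss(pred, target):
    # L2 reconstruction loss ||I_hat - I||^2 between rendered and ground-truth views.
    return float(np.sum((pred - target) ** 2))

def latent_penalty(z, weight=1e-3):
    # Small ||z||^2 penalty on a regressed object code z, enforcing a Gaussian prior.
    # The weight value here is a placeholder; the paper does not specify it.
    return weight * float(np.sum(z ** 2))

def total_loss(pred, target, object_codes, perceptual_fn=None, penalty_weight=1e-3):
    # Combine the three terms described in the paper's setup.
    # perceptual_fn is a stand-in for a deep-feature perceptual loss;
    # pass None to skip it (as on datasets where it is not used).
    loss = reconstruction_loss(pred, target)
    if perceptual_fn is not None:
        loss += perceptual_fn(pred, target)
    loss += sum(latent_penalty(z, penalty_weight) for z in object_codes)
    return loss
```

The coarse-to-fine schedule in the same row would simply call `total_loss` on 64×64 renders early in training and switch `pred`/`target` to 128×128 renders later.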