ROOTS: Object-Centric Representation and Rendering of 3D Scenes

Authors: Chang Chen, Fei Deng, Sungjin Ahn

JMLR 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "In experiments, in addition to generation quality, we also demonstrate that the learned representation permits object-wise manipulation and novel scene generation, and generalizes to various settings. Results can be found on our project website: https://sites.google.com/view/roots3d. Keywords: object-centric representations, latent variable models, 3D scene generation, variational inference, 3D-aware representations"
Researcher Affiliation | Academia | "Chang Chen, EMAIL, Department of Computer Science, Rutgers University, Piscataway, NJ 08854, USA"
Pseudocode | Yes | "Appendix D. Summary of ROOTS Encoder and Decoder; Algorithm 1: ROOTS Encoder; Algorithm 2: ROOTS Decoder"
Open Source Code | No | The paper states: "Results can be found on our project website: https://sites.google.com/view/roots3d." However, this refers only to results and does not explicitly state that source code for the method is available.
Open Datasets | Yes | "For evaluation on realistic objects, we also included a publicly available ShapeNet arrangement data set (Tung et al., 2019; Cheng et al., 2018)."
Dataset Splits | Yes | "Both data sets contain 60K multi-object scenes (50K for training, 5K for validation, and 5K for testing) with complete ground-truth scene specifications including object shapes, colors, positions, and sizes. Each scene is rendered as 128×128 color images from 30 random viewpoints. During training, we sample 10-20 viewpoints uniformly at random as contexts and use the rest as queries. For evaluation and visualization, we use 15 viewpoints as contexts and the rest as queries."
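The context/query viewpoint sampling quoted above can be sketched as a small helper. This is an illustrative reconstruction, not the authors' code; the function name `split_viewpoints` and its argument names are hypothetical.

```python
import random

def split_viewpoints(num_views=30, min_ctx=10, max_ctx=20, rng=None):
    """Partition a scene's rendered viewpoints into context and query sets.

    Per the quoted setup: each scene has 30 viewpoints; during training,
    10-20 of them are sampled uniformly at random as contexts, and the
    remaining viewpoints serve as queries.
    """
    rng = rng or random.Random()
    num_ctx = rng.randint(min_ctx, max_ctx)  # how many contexts this scene gets
    ids = list(range(num_views))
    rng.shuffle(ids)                         # uniform random assignment
    return ids[:num_ctx], ids[num_ctx:]     # (contexts, queries)

# Evaluation uses a fixed 15-context split, which corresponds to
# split_viewpoints(min_ctx=15, max_ctx=15).
contexts, queries = split_viewpoints(rng=random.Random(0))
```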
Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, processors, or memory used for running the experiments.
Software Dependencies | No | The paper mentions software components such as "RMSprop (Hinton et al., 2012)", "ConvDRAW (Gregor et al., 2016)", and "Adam (Kingma and Ba, 2015)", and refers to "the official implementation" of IODINE, but does not provide version numbers for any of these dependencies.
Experiment Setup | Yes | "ROOTS is trained for 200 epochs with a batch size of 12, using RMSprop (Hinton et al., 2012) with learning rates chosen from {1×10⁻³, 3×10⁻⁴, 1×10⁻⁴, 3×10⁻⁵}. We set γ = 7 and ρ = 0.999 during training, thereby encouraging the model to decompose the scenes into as few objects as possible."
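The hyperparameters quoted above can be collected into a config with a simple grid-search helper. Only the numeric values come from the paper; the `CONFIG` dict layout, the `best_learning_rate` helper, and the user-supplied `validate` callback are hypothetical scaffolding, not the authors' training code.

```python
# Hyperparameters quoted from the paper's experiment setup.
CONFIG = {
    "optimizer": "RMSprop",
    "epochs": 200,
    "batch_size": 12,
    "learning_rates": [1e-3, 3e-4, 1e-4, 3e-5],  # candidate grid
    "gamma": 7,      # gamma = 7: pushes decomposition toward fewer objects
    "rho": 0.999,    # rho = 0.999, used alongside gamma during training
}

def best_learning_rate(config, validate):
    """Pick the candidate learning rate with the lowest validation loss.

    `validate` is a user-supplied callable mapping a learning rate to a
    validation loss (e.g. from one training run at that rate).
    """
    return min(config["learning_rates"], key=validate)
```

The paper reports choosing among the four rates, so any selection criterion (here, minimum validation loss) is an assumption about how that choice was made.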