GenXD: Generating Any 3D and 4D Scenes

Authors: Yuyang Zhao, Chung-Ching Lin, Kevin Lin, Zhiwen Yan, Linjie Li, Zhengyuan Yang, Jianfeng Wang, Gim Hee Lee, Lijuan Wang

ICLR 2025

Reproducibility Variable Result LLM Response
Research Type Experimental We perform extensive evaluations across various real-world and synthetic datasets, demonstrating GenXD's effectiveness and versatility compared to previous methods in 3D and 4D generation. (Sec. 5.1, Experimental Setup, Datasets) GenXD is trained with the combination of 3D and 4D datasets.
Researcher Affiliation Collaboration National University of Singapore, Microsoft Corporation
Pseudocode No The paper does not contain explicitly labeled pseudocode or algorithm blocks.
Open Source Code Yes Our curated 4D dataset, CamVid-30K, and GenXD model will be made publicly available.
Open Datasets Yes This large-scale dataset, termed CamVid-30K, will be made available for public use. For 3D datasets, we leverage five datasets with camera pose annotation: Objaverse (Deitke et al., 2023), MVImageNet (Yu et al., 2023), Co3D (Reizenstein et al., 2021), Re10K (Zhou et al., 2018) and ACID (Liu et al., 2021). For 4D datasets, we leverage the synthetic data Objaverse-XL-Animation (Deitke et al., 2024; Liang et al., 2024) and our CamVid-30K.
Dataset Splits No The paper names the datasets used for training and evaluation but does not provide comprehensive training/validation/test splits for the data used to train the main GenXD model. For one specific experiment it states that "3 views in each scene are used for training", but it does not detail the test or validation splits for those datasets, nor the splits for the overall training of GenXD.
Hardware Specification Yes The model is trained on 32 A100 GPUs with batch size 128 and resolution 256×256.
Software Dependencies No The paper mentions the use of Stable Video Diffusion as a pretrained model and the AdamW optimizer, but does not provide specific version numbers for software dependencies such as libraries, frameworks, or programming languages.
Experiment Setup Yes GenXD is trained in three stages. We first train the UNet only with 3D data for 500K iterations and then fine-tune it with both 3D and 4D data for 500K iterations in single-view mode. Finally, GenXD is trained with both single-view and multi-view modes with all the data for 500K iterations. The model is trained on 32 A100 GPUs with batch size 128 and resolution 256×256. The AdamW (Loshchilov & Hutter, 2019) optimizer with learning rate 5×10⁻⁵ is adopted.
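The quoted three-stage schedule and hyperparameters can be summarized in a small sketch. Stage boundaries, data mixes, and hyperparameter values below are taken from the reported setup; the structure itself (the `STAGES` list and the `stage_for_iteration` helper) is a hypothetical illustration, not code from the paper.

```python
# Sketch of the reported GenXD training schedule; all values are quoted
# from the paper's experiment setup, the code structure is hypothetical.

STAGES = [
    # (name, iterations, training data, view mode)
    ("stage1", 500_000, "3D only", "single-view"),
    ("stage2", 500_000, "3D + 4D", "single-view"),
    ("stage3", 500_000, "3D + 4D", "single-view + multi-view"),
]

HYPERPARAMS = {
    "hardware": "32x A100",
    "batch_size": 128,
    "resolution": (256, 256),
    "optimizer": "AdamW",
    "learning_rate": 5e-5,
}

def stage_for_iteration(it: int) -> str:
    """Map a global iteration count to its training stage."""
    total = 0
    for name, iters, _, _ in STAGES:
        total += iters
        if it < total:
            return name
    raise ValueError("iteration beyond the 1.5M-iteration schedule")
```

For example, `stage_for_iteration(750_000)` falls in the second stage, where both 3D and 4D data are used in single-view mode.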