High-quality Text-to-3D Character Generation with SparseCubes and Sparse Transformers

Authors: Jiachen Qian, Hongye Yang, Shuang Wu, Jingxi Xu, Feihu Zhang

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section, we conduct both qualitative and quantitative experiments to compare our method with existing state-of-the-art 3D generation methods.
Researcher Affiliation | Industry | Jiachen Qian (1), Hongye Yang (1), Shuang Wu (1, 2), Jingxi Xu (1), Feihu Zhang (1); (1) Dream Tech, (2) Nanjing University
Pseudocode | No | The paper describes the proposed method and network architecture in detail but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide an explicit statement about releasing code or a link to a code repository.
Open Datasets | Yes | We train our method on the LVIS subset of the Objaverse dataset (Deitke et al., 2023). We test our method on the Render People dataset (ren, 2018).
Dataset Splits | Yes | We use 20K anime characters to train our models. The test set includes data of 30 randomly selected anime characters.
Hardware Specification | Yes | we first optimize the coarse proposal network using Eq. 2 for 121 hours with 32 A100 GPUs (30k iterations).
Software Dependencies | No | The paper mentions several frameworks and models, such as PIXART-Σ (Chen et al., 2024), DINO (Caron et al., 2021), and Flash Attention (Dao et al., 2022) in xFormers (Lefaudeux et al., 2022), but it does not specify version numbers for Python, PyTorch, CUDA, or other key software dependencies.
Experiment Setup | Yes | During the training of the Coarse Proposal Network, we set λ_lpips and λ_mask to 2, and λ_depth and λ_normal to 1. For the training of the Sparse Cube Transformer, we set λ_lpips and λ_normal to 1, λ_mask to 8, and λ_depth to 20. ... we first optimize the coarse proposal network using Eq. 2 for 121 hours with 32 A100 GPUs (30k iterations). The batch size is 5 and the learning rate is 4e-4 with a cosine decay. ... train the Sparse Cube Transformer with L2 loss for 14k iterations. Finally, we start to optimize the Sparse Cube Transformer using the same loss as the coarse proposal network with a smaller learning rate of 5e-5 and a batch size of 2.
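The experiment-setup quote fully specifies the loss weights and a cosine-decayed learning rate, so the training recipe is reimplementable from the text alone. As an illustration of how those numbers compose, here is a minimal sketch; the function names and the weight dictionary are our own assumptions for clarity, not code from the paper.

```python
import math

# Stage-1 (Coarse Proposal Network) weights quoted from the paper:
# lambda_lpips = lambda_mask = 2, lambda_depth = lambda_normal = 1.
# (The dictionary layout is our illustrative assumption.)
STAGE1_WEIGHTS = {"lpips": 2.0, "mask": 2.0, "depth": 1.0, "normal": 1.0}
# Stage-3 (Sparse Cube Transformer) weights quoted from the paper.
STAGE3_WEIGHTS = {"lpips": 1.0, "mask": 8.0, "depth": 20.0, "normal": 1.0}


def total_loss(terms, weights):
    """Weighted sum of per-term losses (an Eq. 2-style objective)."""
    return sum(weights[name] * terms[name] for name in weights)


def cosine_lr(step, total_steps, base_lr=4e-4, min_lr=0.0):
    """Cosine decay from base_lr (4e-4 in stage 1) toward min_lr
    over total_steps (30k iterations in stage 1)."""
    progress = min(step / total_steps, 1.0)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))
```

For example, with all four per-term losses equal to 1, the stage-1 objective evaluates to 2 + 2 + 1 + 1 = 6, and `cosine_lr` starts at 4e-4 and decays smoothly to 0 by step 30k.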