TranSplat: Generalizable 3D Gaussian Splatting from Sparse Multi-View Images with Transformers
Authors: Chuanrui Zhang, Yingshuang Zou, Zhuoling Li, Minmin Yi, Haoqian Wang
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate TranSplat on two large-scale benchmarks: RealEstate10K (Zhou et al. 2018) and ACID (Liu et al. 2021a). Extensive experiments are conducted and demonstrate that TranSplat achieves the best results in G-3DGS. Notably, compared to existing counterparts, TranSplat presents strong cross-dataset generalization ability. Comprehensively, our main contributions are as follows: We propose to utilize the depth confidence map to enhance matching between various views and correspondingly significantly improve the reconstruction precision in regions with insufficient texture or repetitive patterns. We propose a strategy that encodes the priors of monocular depth estimators into the prediction of Gaussian parameters, ensuring precise 3D Gaussian centers are estimated even in non-overlapping areas. The derived method TranSplat achieves the best results on two large-scale benchmarks and presents strong cross-dataset generalization ability. |
| Researcher Affiliation | Collaboration | Chuanrui Zhang1*, Yingshuang Zou1*, Zhuoling Li2, Minmin Yi3, Haoqian Wang1 1Tsinghua University, 2The University of Hong Kong, 3E-surfing Vision Technology Co., Ltd EMAIL, EMAIL, EMAIL, EMAIL |
| Pseudocode | No | The paper describes methods through figures (Figure 2, Figure 3, Figure 4) and textual explanations but does not contain explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | Project Page: https://xingyoujun.github.io/transplat/ (Visiting this project page explicitly states: 'Code will be released soon.') |
| Open Datasets | Yes | We evaluate TranSplat on two large-scale benchmarks: RealEstate10K (Zhou et al. 2018) and ACID (Liu et al. 2021a) datasets. Additionally, to assess cross-dataset generalization, we evaluate all methods on the multi-view DTU dataset (Jensen et al. 2014). |
| Dataset Splits | Yes | RealEstate10K comprises home walkthrough videos from YouTube, with 67,477 scenes for training and 7,289 scenes for testing. The ACID dataset, featuring aerial landscape videos, includes 11,075 training scenes and 1,972 testing scenes. |
| Hardware Specification | Yes | All models are trained with a batch size of 14 on 7 RTX 3090 GPUs for 300,000 iterations using the Adam (Kingma 2014) optimizer. During inference, we measure speed and memory cost with one RTX 3090 GPU. |
| Software Dependencies | No | The paper mentions models, optimizers (Adam), and frameworks (Swin Transformer, Depth Anything V2, diffusion models) but does not provide specific version numbers for any software libraries or dependencies used for implementation. |
| Experiment Setup | Yes | Input images are resized to 256×256, following the method outlined in (Chen et al. 2024). In all experiments, the number of depth candidates is set to 128. We sample P = 4 deformable points in the Depth-Aware Deformable Matching Transformer for the main results. For the Depth Anything V2 (Yang et al. 2024) module, we use the base size to balance training cost and result quality. All models are trained with a batch size of 14 on 7 RTX 3090 GPUs for 300,000 iterations using the Adam (Kingma 2014) optimizer. |
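As a quick reference, the hyperparameters reported in the Experiment Setup row can be collected into a single configuration sketch. This is a minimal illustration only: the authors' code is not yet released, so all field and function names below are assumptions, not identifiers from the TranSplat codebase.

```python
# Hedged sketch: reported TranSplat training hyperparameters gathered into a
# plain config dict. Field names are illustrative, not from the (unreleased)
# official implementation.
TRAIN_CONFIG = {
    "image_size": (256, 256),        # input images resized to 256x256
    "depth_candidates": 128,         # number of depth candidates
    "deformable_points": 4,          # P in the Depth-Aware Deformable Matching Transformer
    "depth_anything_v2_size": "base",  # monocular depth prior module size
    "batch_size": 14,                # total batch size across GPUs
    "num_gpus": 7,                   # RTX 3090 GPUs
    "iterations": 300_000,
    "optimizer": "adam",
}


def per_gpu_batch_size(cfg: dict) -> int:
    """Per-GPU batch size under naive data parallelism (assumption:
    the total batch of 14 is split evenly across the 7 GPUs)."""
    return cfg["batch_size"] // cfg["num_gpus"]
```

Under that even-split assumption, each GPU would process 2 samples per iteration; the paper itself does not state how the batch is distributed.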