STORM: Spatio-TempOral Reconstruction Model For Large-Scale Outdoor Scenes

Authors: Jiawei Yang, Jiahui Huang, Boris Ivanovic, Yuxiao Chen, Yan Wang, Boyi Li, Yurong You, Apoorva Sharma, Maximilian Igl, Peter Karkus, Danfei Xu, Yue Wang, Marco Pavone

ICLR 2025

Reproducibility Variable Result LLM Response
Research Type Experimental Extensive experiments on public datasets show that STORM achieves precise dynamic scene reconstruction, surpassing state-of-the-art per-scene optimization methods (+4.3 to +6.6 PSNR) and existing feed-forward approaches (+2.1 to +4.7 PSNR) in dynamic regions.
Researcher Affiliation Collaboration EMAIL, University of Southern California; EMAIL, Georgia Institute of Technology; EMAIL, Stanford University; EMAIL, NVIDIA Research. Equal advising.
Pseudocode No The paper describes the methodology in text and through figures but does not include any structured pseudocode or algorithm blocks.
Open Source Code No For more details, please visit our project page. The paper mentions a 'project page' but does not explicitly state that the source code for the described methodology is available there, nor does it provide a direct link to a code repository.
Open Datasets Yes We conduct extensive experiments on the Waymo Open Dataset (Sun et al., 2020), nuScenes (Caesar et al., 2020), and Argoverse 2 (Wilson et al.) to evaluate the performance of STORM.
Dataset Splits Yes We primarily conduct experiments on the Waymo Open Dataset (Sun et al., 2020), which contains 1,000 sequences of driving logs: 798 sequences for training and 202 for validation.
Hardware Specification Yes Speed metrics are estimated on a single A100 GPU.
Software Dependencies No Our GS backend is based on gsplat (Ye et al., 2024). For the LPIPS loss, we utilize a VGG-19-based (Simonyan & Zisserman, 2014) implementation. The paper mentions software components but does not provide specific version numbers for reproducibility.
Experiment Setup Yes We train our model for 100,000 iterations with a global batch size of 64 on NVIDIA A100 GPUs, using a learning rate of 4e-4. The training process utilizes the AdamW optimizer (Loshchilov & Hutter, 2019) along with a cosine learning rate scheduler that includes a linear warmup phase over the first 5,000 iterations. We set λlpips to 0.05, λsky to 0.1, and λreg to 5e-3 in all experiments.
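The reported training setup can be summarized in code. The sketch below is an assumption-laden reconstruction, not the authors' implementation: the warmup/cosine schedule shape (linear ramp to the base rate, then cosine decay to zero) and the dictionary of loss weights simply transcribe the hyperparameters quoted above; any detail not stated in the paper (e.g. a nonzero final learning-rate floor) is guessed.

```python
import math

# Hyperparameters quoted from the paper's experiment setup.
TOTAL_ITERS = 100_000
WARMUP_ITERS = 5_000
BASE_LR = 4e-4

# Loss weights reported in the paper (subscript names flattened).
LOSS_WEIGHTS = {"lpips": 0.05, "sky": 0.1, "reg": 5e-3}

def lr_at(step: int) -> float:
    """Assumed schedule: linear warmup over the first 5,000 iterations,
    then cosine decay to zero over the remaining iterations."""
    if step < WARMUP_ITERS:
        return BASE_LR * step / WARMUP_ITERS
    progress = (step - WARMUP_ITERS) / (TOTAL_ITERS - WARMUP_ITERS)
    return BASE_LR * 0.5 * (1.0 + math.cos(math.pi * progress))
```

With these definitions, `lr_at(0)` is 0, `lr_at(5_000)` equals the base rate 4e-4, and `lr_at(100_000)` decays to 0.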