SCas4D: Structural Cascaded Optimization for Boosting Persistent 4D Novel View Synthesis

Authors: Jipeng Lyu, Jiahua Dong, Yu-Xiong Wang

TMLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Extensive experiments demonstrate our method's effectiveness in novel view synthesis and dense point tracking tasks." "Extensive experiments highlight its strong performance in novel view synthesis and dense point tracking, with potential for further improvement through other 3D Gaussian Splatting variants." "In this section, we conduct ablation studies to analyze the effectiveness of our method."
Researcher Affiliation | Academia | Jipeng Lyu (EMAIL), University of Illinois Urbana-Champaign; Jiahua Dong (EMAIL), University of Illinois Urbana-Champaign; Yu-Xiong Wang (EMAIL), University of Illinois Urbana-Champaign
Pseudocode | Yes | "Algorithm 1 summarizes our training process."
Open Source Code | No | "Please find our project page at https://github-tree-0.github.io/SCas4D-project-page/."
Open Datasets | Yes | "We conduct our experiments on two datasets: the Panoptic dataset (Joo et al., 2015; 2019), which includes six real-world dynamic scenes (Basketball, Boxes, Football, Juggle, Softball, and Tennis), and the synthetic Fast Particle dataset (Abou-Chakra et al., 2024), containing six highly dynamic scenes (Robot, Spring, Wheel, Pendulums, Robot-Task, and Cloth)."
Dataset Splits | Yes | "Each scene in this dataset contains 150 frames captured by a total of 31 cameras, with 27 cameras used for training and 4 for testing." "This dataset includes 40 cameras in total, from which we randomly select 4 as testing cameras and the remaining 36 as training cameras."
Hardware Specification | Yes | "On our single NVIDIA A40 GPU, training 100 iterations takes 1-3 seconds." "This work used computational resources, including the NCSA Delta and Delta AI supercomputers through allocations CIS230012 and CIS240370 from the Advanced Cyberinfrastructure Coordination Ecosystem: Services & Support (ACCESS) program, as well as the TACC Frontera supercomputer and Amazon Web Services (AWS) through the National Artificial Intelligence Research Resource (NAIRR) Pilot."
Software Dependencies | No | "Therefore, we build upon the online method Dynamic3DGS (Luiten et al., 2024) as our codebase and integrate our cascaded optimization approach, where each frame's Gaussian outputs are generated solely based on the state of Gaussians from the previous frame and the 2D image inputs from the current frame."
Experiment Setup | Yes | "We use fixed empirical weights of [0.19, 0.10, 0.19, 0.48, 0.05] for rigidity, isometry, rotation, scale, and RGB losses, respectively." "In our implementation, K = 3. The final numbers of clusters at each layer are 64, 320, 1280." "For the 12 scenes across the two datasets, we train for 20,000 iterations to obtain the checkpoints." "We fixed the number of training iterations between every two frames to 100 and 2000 for comparison."
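The reported camera split (4 of 40 cameras randomly held out for testing, 36 for training) can be sketched as follows. This is a hedged illustration, not code from the paper; the function name `split_cameras` and the fixed seed are assumptions for reproducibility of the example.

```python
import random

def split_cameras(num_cameras: int = 40, num_test: int = 4, seed: int = 0):
    """Randomly hold out `num_test` camera indices for testing and use
    the remainder for training (hypothetical helper, not from the paper)."""
    rng = random.Random(seed)
    cameras = list(range(num_cameras))
    test = sorted(rng.sample(cameras, num_test))
    train = [c for c in cameras if c not in test]
    return train, test

train_cams, test_cams = split_cameras()
print(len(train_cams), len(test_cams))  # 36 4
```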
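The quoted experiment setup amounts to a small configuration: five fixed loss weights, K = 3 cascade levels with 64/320/1280 clusters, 20,000 checkpoint iterations, and 100 or 2,000 per-frame iterations. A minimal sketch of that configuration, with illustrative names not taken from the SCas4D codebase:

```python
# Hedged sketch of the reported training configuration; identifiers
# are assumptions, only the numeric values come from the paper.
LOSS_WEIGHTS = {
    "rigidity": 0.19,
    "isometry": 0.10,
    "rotation": 0.19,
    "scale": 0.48,
    "rgb": 0.05,
}
NUM_CASCADE_LEVELS = 3              # K = 3
CLUSTERS_PER_LEVEL = [64, 320, 1280]  # final cluster counts per layer
CHECKPOINT_ITERS = 20_000           # iterations to obtain checkpoints
PER_FRAME_ITERS = (100, 2000)       # compared per-frame iteration budgets

def total_loss(losses: dict) -> float:
    """Weighted sum of per-term losses (illustrative combination)."""
    return sum(LOSS_WEIGHTS[k] * v for k, v in losses.items())
```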