DreamPhysics: Learning Physics-Based 3D Dynamics with Video Diffusion Priors
Authors: Tianyu Huang, Haoze Zhang, Yihan Zeng, Zhilu Zhang, Hui Li, Wangmeng Zuo, Rynson W. H. Lau
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results demonstrate that our method enjoys more realistic motions than state-of-the-art methods do. Extensive experiments show that our results enjoy more realistic motion simulation. Extensive ablation studies are then conducted to demonstrate the effectiveness of our newly proposed components. |
| Researcher Affiliation | Collaboration | 1 Harbin Institute of Technology, 2 City University of Hong Kong, 3 Huawei Noah's Ark Lab |
| Pseudocode | No | The paper describes methods using equations and textual explanations but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code: https://github.com/tyhuang0428/DreamPhysics |
| Open Datasets | Yes | Dataset. We collect seven 3D static scenes or objects from previous works (Xie et al. 2023; Zhang et al. 2024) and 3D GS generative models (Tang et al. 2024). |
| Dataset Splits | No | The paper mentions collecting 3D static scenes and using VBench for evaluation, but it does not specify any training/test/validation splits or other dataset partitioning strategies for reproducibility. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU models, or memory specifications used for running the experiments. |
| Software Dependencies | No | The paper mentions software components like 'Warp', 'ModelScope', 'Stable Video Diffusion', 'KAN-based', and 'LAION aesthetic predictor' but does not specify their version numbers or the versions of general programming languages or libraries (e.g., Python, PyTorch, CUDA). |
| Experiment Setup | Yes | For most simulation scenes, we set the simulation duration as 5 × 10⁻⁵ seconds and the frame duration as 4 × 10⁻² seconds. Thus, we simulate 800 steps between every two renderings and include the simulation gradient of the last step in the optimization. The numbers of their generated video frames T are 16 and 25, respectively. For frame boosting, we set M = 5, boosting the video slices to 5 groups. The setting of MDS follows SDS, where the CFG value is set to 100. We stop the training if optimized parameter values stabilize within one order of magnitude. The training process requires around 30 iterations. |
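The step counts in the experiment setup follow directly from the two time constants: a frame duration of 4 × 10⁻² s divided by a simulation step of 5 × 10⁻⁵ s gives the 800 substeps quoted between renderings. A minimal sketch of that arithmetic, plus one plausible way to split a video into M = 5 groups for frame boosting (the helper names and the exact grouping scheme are illustrative assumptions, not taken from the DreamPhysics codebase):

```python
# Constants quoted in the paper's experiment setup.
SIM_DT = 5e-5    # simulation step duration (seconds)
FRAME_DT = 4e-2  # rendered frame duration (seconds)

def substeps_per_frame(sim_dt: float, frame_dt: float) -> int:
    """Physics steps simulated between two consecutive renderings."""
    return round(frame_dt / sim_dt)

def boost_slices(num_frames: int, m: int) -> list[range]:
    """Split `num_frames` video frames into m contiguous groups.

    The paper sets M = 5 for frame boosting; this even split is an
    assumed grouping, chosen only to illustrate the bookkeeping.
    """
    base, rem = divmod(num_frames, m)
    slices, start = [], 0
    for i in range(m):
        length = base + (1 if i < rem else 0)
        slices.append(range(start, start + length))
        start += length
    return slices

print(substeps_per_frame(SIM_DT, FRAME_DT))   # 800, matching the paper
print([len(s) for s in boost_slices(25, 5)])  # a 25-frame video into 5 groups
```

For the two video lengths mentioned (T = 16 and T = 25), the same split yields groups of 4/3/3/3/3 and 5/5/5/5/5 frames respectively under this assumed scheme.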