OmniPhysGS: 3D Constitutive Gaussians for General Physics-Based Dynamics Generation

Authors: Yuchen Lin, Chenguo Lin, Jianjin Xu, Yadong Mu

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We compare our method with three state-of-the-art 3D physically plausible dynamic synthesis methods: (1) PhysDreamer (Zhang et al., 2024b), which uses videos generated by diffusion models to supervise a Young's modulus field for 3D objects; (2) Physics3D (Liu et al., 2024) and (3) DreamPhysics (Huang et al., 2024), which adopt Score Distillation Sampling to optimize Young's modulus, Poisson's ratio, and the damping coefficient. Evaluation Metrics: Since there is no ground-truth data for dynamic 3D scenes, we propose to evaluate the performance of our method in two aspects. First, we measure the alignment between generated videos and text prompts using CLIPSIM (Wu et al., 2021), the average cosine similarity between the text embedding and each frame embedding. Second, DiffSSIM and DiffCLIP are proposed to evaluate the expressiveness and robustness of our method by measuring the difference between the video generated by a trained model and that of a randomly initialized model, using SSIM and CLIPSIM respectively. Higher CLIPSIM, DiffSSIM, and DiffCLIP indicate better performance.
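Both CLIPSIM and DiffCLIP reduce to averages of cosine similarities over precomputed CLIP embeddings. A minimal numpy sketch, assuming the text and per-frame embeddings are already extracted (the function and argument names are illustrative, not from the paper):

```python
import numpy as np

def clipsim(text_emb, frame_embs):
    """CLIPSIM: average cosine similarity between one text embedding
    (D,) and each frame embedding in a (T, D) array."""
    text = text_emb / np.linalg.norm(text_emb)
    frames = frame_embs / np.linalg.norm(frame_embs, axis=1, keepdims=True)
    return float(np.mean(frames @ text))

def diff_clip(text_emb, trained_frames, random_frames):
    """DiffCLIP: CLIPSIM gap between the trained model's video and a
    randomly initialized model's video (higher is better)."""
    return clipsim(text_emb, trained_frames) - clipsim(text_emb, random_frames)
```

DiffSSIM follows the same pattern with per-frame SSIM in place of CLIP cosine similarity.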
Researcher Affiliation | Academia | Yuchen Lin¹, Chenguo Lin¹, Jianjin Xu², Yadong Mu¹ — ¹Peking University, ²Carnegie Mellon University
Pseudocode | Yes | In the Material Point Method (see Algorithm 1 in Appendix A.2), the only factors that determine the material properties of each particle are the constitutive models Ψ(·) and ψ(·) and the physical parameters γ ∈ ℝ^K, used in the functions F2Stress and ReturnMap; the rest of the MPM pipeline is independent of the material properties. Algorithm 1: A Moving Least Squares Material Point Method (MLS-MPM) Step
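The quoted observation — that material behavior enters MPM only through the constitutive model and return map — can be illustrated by making those two pieces interchangeable callables in the particle update. The fixed-corotated stress below is a standard textbook example, not necessarily the paper's exact model, and all names are illustrative:

```python
import numpy as np

def f2stress_fixed_corotated(F, mu=1.0, lam=1.0):
    """First Piola-Kirchhoff stress for the fixed-corotated model:
    P(F) = 2*mu*(F - R) + lam*(J - 1)*J*F^{-T} (standard example)."""
    U, _, Vt = np.linalg.svd(F)
    R = U @ Vt                       # rotation from the polar decomposition of F
    J = np.linalg.det(F)
    return 2 * mu * (F - R) + lam * (J - 1) * J * np.linalg.inv(F).T

def return_map_identity(F):
    """Purely elastic material: no plastic projection of F."""
    return F

def particle_material_update(F, f2stress, return_map):
    """The material-dependent part of one MPM step; grid transfers and
    advection elsewhere in the pipeline never touch these callables."""
    F = return_map(F)
    return F, f2stress(F)
```

Swapping in a different (f2stress, return_map) pair changes the simulated material while leaving the rest of the solver untouched, which is exactly what lets the paper learn a per-particle material category.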
Open Source Code | No | Our implementation will be released to the public for future research. Our code and data will be released after curation.
Open Datasets | Yes | Datasets: We collect several static 3DGS scenes from the public dataset of PhysGaussian (Xie et al., 2024). We also use BlenderNeRF (Raafat, 2024) to generate more scenes with different materials and objects.
Dataset Splits | No | The paper mentions collecting data and generating scenes but does not provide specific percentages, sample counts, or methodologies for splitting the data into training, validation, or test sets for its own experiments.
Hardware Specification | Yes | Training Details: All experiments are conducted on a single NVIDIA A6000 (48GB) GPU.
Software Dependencies | No | The paper mentions 'Warp (Macklin, 2022)', the 'Adam optimizer (Kingma, 2014)', the 'official implementation of the Gaussian Splatting renderer (Kerbl et al., 2023)', and 'ModelScope (Wang et al., 2023)'. However, it does not provide version numbers for software dependencies such as Python, PyTorch, or CUDA, which are necessary for full reproducibility.
Experiment Setup | Yes | Simulating Details: We set the simulation timestep Δt to 3×10⁻⁴ for all experiments. In training, the simulated states are sampled every 10 steps and rendered into images with a fixed camera. The video length is set to 150 frames at a frame rate of 30 fps, for a total of 5 seconds. We load all Gaussian particles into a 1×1×1 bounding box divided into a 25×25×25 grid. The simulated world has a gravity of 9.8 m/s². Training Details: All experiments are conducted on a single NVIDIA A6000 (48GB) GPU. We train our model using the Adam optimizer (Kingma, 2014) with a learning rate of 5×10⁻⁵ for 5 iterations per scene. In each iteration, the whole simulation is divided into 10 sequential stages, with 15 frames rendered per stage. The video clip of each stage is optimized against a pretrained text-to-video diffusion model (Wang et al., 2023) for 30 iterations before proceeding to the next stage. For the learnable Constitutive Gaussians, we first use farthest point sampling (FPS) to select 8192 centers from the original Gaussian particles, then use a convolutional layer to encode the features within each neighborhood and a simple MLP to predict the material category. Each neighborhood includes 32 particles.
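The neighborhood construction described above starts from farthest point sampling. A minimal numpy sketch of greedy FPS, assuming an (N, 3) array of particle centers (the paper's implementation may differ; names here are illustrative):

```python
import numpy as np

def farthest_point_sampling(points, k, seed=0):
    """Greedy FPS: start from a random point, then repeatedly pick the
    point whose distance to the chosen set is largest. Returns k indices."""
    rng = np.random.default_rng(seed)
    n = points.shape[0]
    chosen = [int(rng.integers(n))]
    # dists[i] = distance from point i to the nearest chosen point so far
    dists = np.linalg.norm(points - points[chosen[0]], axis=1)
    for _ in range(k - 1):
        idx = int(np.argmax(dists))
        chosen.append(idx)
        dists = np.minimum(dists, np.linalg.norm(points - points[idx], axis=1))
    return np.array(chosen)
```

In the paper's setup this would select 8192 centers; each center's 32-particle neighborhood is then encoded by a convolutional layer and classified by an MLP into a material category.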