GFlow: Recovering 4D World from Monocular Video
Authors: Shizun Wang, Xingyi Yang, Qiuhong Shen, Zhenxiang Jiang, Xinchao Wang
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct experiments on a challenging video dataset, DAVIS (Perazzi et al. 2016; Pont-Tuset et al. 2017), which contains real-world videos of about 30–100 frames with various scenarios and motion dynamics. We report the reconstruction quality results on 30 DAVIS 2017 evaluation videos. For quantitative evaluation of reconstruction quality, we report standard PSNR, SSIM and LPIPS (Zhang et al. 2018) metrics. Additionally, the paper includes sections like "Implementation Details", "Evaluation of Reconstruction Quality", "Quantitative Results", "Qualitative Results", and "Ablation Study", all indicating empirical studies and data analysis. |
| Researcher Affiliation | Academia | The authors are affiliated with "National University of Singapore" and use email domains such as "u.nus.edu" and "nus.edu.sg", which are indicative of an academic institution. |
| Pseudocode | No | The paper describes its methodology and overall pipeline in Section 4 'Methodology' and Section 4.3 'Overall Pipeline' using descriptive text and a high-level diagram (Figure 2). It does not contain any explicitly labeled 'Pseudocode' or 'Algorithm' blocks, nor does it present structured steps formatted like code. |
| Open Source Code | Yes | The paper provides a website link: "Website https://littlepure2333.github.io/GFlow". This URL is a GitHub Pages project site, which serves as a concrete source for the code. |
| Open Datasets | Yes | The paper states: "We conduct experiments on a challenging video dataset, DAVIS (Perazzi et al. 2016; Pont Tuset et al. 2017) dataset". The DAVIS dataset is a well-known public dataset, and the paper provides formal citations for it. |
| Dataset Splits | No | The paper mentions using "30 DAVIS 2017 evaluation videos" for reporting reconstruction quality results. However, it does not provide specific details on the training, validation, and test splits (e.g., exact percentages or sample counts) used for their experiments, nor does it explicitly reference a standard split they followed for their entire methodology. |
| Hardware Specification | Yes | The paper explicitly states the hardware used: "All experiments are conducted on a single NVIDIA RTX A5000 GPU." |
| Software Dependencies | No | The paper mentions using specific tools and models like MASt3R, UniMatch, and the Adam optimizer. However, it does not provide specific version numbers for these or other underlying software components (e.g., Python, PyTorch, CUDA libraries) required for exact reproduction. |
| Experiment Setup | Yes | The paper provides detailed experimental setup information, including: "The initial number of Gaussian points is set to 50,000. The learning rate for Gaussian optimization is set to 4e-3, and for camera optimization, it is set to 1e-3. The Adam optimizer is used with 500 iterations for Gaussian optimization in the first frame, 150 iterations for camera optimization, and 300 iterations for Gaussian optimization in subsequent frames. The gradient of color is set to zero... We balance the loss term by setting λp = 1, λd = 0.1, λf = 0.01, and λi = 50. Densification is conducted at the 150 and 300 steps in the first frame optimization. For subsequent frames, the densification occurs at the first step with a new content mask applied, and also occurs at the 100 and 200 steps with error-thresholding mask applied. The error threshold in densification is set to 0.01." |
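The hyperparameters quoted in the Experiment Setup cell can be collected into a single configuration sketch, which makes the reported values easier to check at a glance. This is an illustrative reconstruction, not the authors' released code; all names (`GFlowConfig`, `total_loss`, the individual field names) are our own assumptions, and only the numeric values come from the paper.

```python
from dataclasses import dataclass

@dataclass
class GFlowConfig:
    """Hyperparameters quoted in the paper's experiment setup (field names are assumed)."""
    num_init_gaussians: int = 50_000     # initial number of Gaussian points
    lr_gaussian: float = 4e-3            # learning rate for Gaussian optimization
    lr_camera: float = 1e-3              # learning rate for camera optimization
    iters_gaussian_first: int = 500      # Adam iterations, Gaussians, first frame
    iters_camera: int = 150              # Adam iterations, camera optimization
    iters_gaussian_subseq: int = 300     # Adam iterations, Gaussians, later frames
    # Loss weights as reported: lambda_p, lambda_d, lambda_f, lambda_i
    lambda_p: float = 1.0
    lambda_d: float = 0.1
    lambda_f: float = 0.01
    lambda_i: float = 50.0
    # Densification schedule and error threshold
    densify_steps_first: tuple = (150, 300)        # first-frame optimization
    densify_steps_subseq: tuple = (1, 100, 200)    # subsequent frames
    densify_error_threshold: float = 0.01

def total_loss(cfg: GFlowConfig, l_p: float, l_d: float, l_f: float, l_i: float) -> float:
    """Weighted sum of the four loss terms with the paper's reported weights."""
    return (cfg.lambda_p * l_p + cfg.lambda_d * l_d
            + cfg.lambda_f * l_f + cfg.lambda_i * l_i)
```

With unit losses, the weighted sum is simply the sum of the four lambdas, which is a quick sanity check that the weights were transcribed correctly.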