Free-Moving Object Reconstruction and Pose Estimation with Virtual Camera

Authors: Haixin Shi, Yinlin Hu, Daniel Koguciuk, Juan-Ting Lin, Mathieu Salzmann, David Ferstl

AAAI 2025

Reproducibility Variable | Result | LLM Response

Research Type: Experimental
  "We evaluate our method on the standard HO3D dataset and a collection of egocentric RGB sequences captured with a head-mounted device. We demonstrate that our approach outperforms most methods significantly, and is on par with recent techniques that assume prior information. [...] We evaluate our method systematically in this section. [...] We first evaluate our method on the standard HO3D dataset (Hampali et al. 2020), which includes video captures of daily objects with a fixed camera."

Researcher Affiliation: Collaboration
  "Haixin Shi¹, Yinlin Hu², Daniel Koguciuk², Juan-Ting Lin², Mathieu Salzmann¹, David Ferstl² (¹EPFL, ²Magic Leap)"

Pseudocode: No
  The paper describes the approach using text, mathematical equations, and figures, but does not include any explicit pseudocode or algorithm blocks.

Open Source Code: Yes
  Project page: https://haixinshi.github.io/fmov/

Open Datasets: Yes
  "We evaluate our method on both the HO3D dataset (Hampali et al. 2020) with fixed camera and a collection of data captured using a head-mounted AR device with egocentric views."

Dataset Splits: Yes
  "We report results on the 9 sequences of HO3D as in (Hampali et al. 2023; Ye, Gupta, and Tulsiani 2022)."

Hardware Specification: Yes
  "On a typical NVIDIA V100 GPU, the training of a 100-frame sequence takes about 3 hours for initialization and 7 hours for refinement."

Software Dependencies: No
  The paper mentions specific models and optimizers, such as the ADAM optimizer (Kingma and Ba 2015) and LoFTR (Sun et al. 2021), but does not provide version numbers for software libraries, programming languages, or other tools used in the implementation.

Experiment Setup: Yes
  "During training, the learning rate warms up linearly from 0 to 5e-4 during the first 5k iterations and then follows a cosine decay schedule with alpha=0.05. For the Pose MLP, we use another ADAM optimizer with a cosine decay schedule of alpha=0.5. We randomly sample 512 rays from the input image batch. During the optimization with the guided virtual camera, we only sample 32 points along each ray for efficiency. We progressively train our model with B consecutive images as a group. For every group, we train the networks for a fixed number of training steps (typically 1K). We sample 20% of the rays from images within previously-converged groups and 80% from the images within the newly added group. We train the networks for 150K training steps for refinement."