Benchmarking Unsupervised Object Representations for Video Sequences
Authors: Marissa A. Weis, Kashyap Chitta, Yash Sharma, Wieland Brendel, Matthias Bethge, Andreas Geiger, Alexander S. Ecker
JMLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To close this gap, we design a benchmark with four data sets of varying complexity and seven additional test sets featuring challenging tracking scenarios relevant for natural videos. Using this benchmark, we compare the perceptual abilities of four object-centric approaches... Our results suggest that the architectures with unconstrained latent representations learn more powerful representations in terms of object detection, segmentation and tracking than the spatial transformer based architectures. We also observe that none of the methods are able to gracefully handle the most challenging tracking scenarios despite their synthetic nature, suggesting that our benchmark may provide fruitful guidance towards learning more robust object-centric video representations. |
| Researcher Affiliation | Academia | 1Institute of Computer Science, University of Göttingen, Germany 2Campus Institute Data Science, Göttingen, Germany 3Department of Computer Science, University of Tübingen, Germany 4Institute for Theoretical Physics, University of Tübingen, Germany 5Bernstein Center for Computational Neuroscience, Tübingen, Germany 6Max Planck Institute for Intelligent Systems, Tübingen, Germany 7Max Planck Institute for Dynamics and Self-Organization, Göttingen, Germany |
| Pseudocode | No | The paper describes the methods (MONet, ViMON, TBA, IODINE, OP3, SCALOR) using prose and mathematical equations in Section C 'Methods' and its subsections. It does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code, data, as well as a public leaderboard of results is available at https://eckerlab.org/code/weis2021/. |
| Open Datasets | Yes | Our code, data, as well as a public leaderboard of results is available at https://eckerlab.org/code/weis2021/. ... Data sets are available at this URL. |
| Dataset Splits | Yes | The training set consists of 10,000 examples whereas the validation set as well as the test set contain 1,000 examples each. ... We generate a training set consisting of 9600 examples, validation set of 384 samples and test set of 1,000 examples ... The training set consists of 10,000 sequences whereas the validation set and the test set contain 1,000 sequences each. |
| Hardware Specification | Yes | Runtime analysis (using a single RTX 2080 Ti GPU). |
| Software Dependencies | No | MONet and ViMON are implemented in PyTorch (Paszke et al., 2019)... k-Means algorithm as implemented by sklearn (Pedregosa et al., 2011). The paper mentions software such as PyTorch and sklearn, but does not provide specific version numbers for these or other key libraries. |
| Experiment Setup | Yes | MONet and ViMON are implemented in PyTorch (Paszke et al., 2019) and trained with the Adam optimizer (Kingma and Ba, 2015) with a batch size of 64 for MONet and 32 for ViMON, using an initial learning rate of 0.0001. ... MONet is trained with β = 0.5 and γ = 1 and ViMON is trained with β = 1 and γ = 2. K = 5 for SpMOT, K = 6 for VMDS and K = 8 for VOR. ... We train SCALOR with a batch size of 16 for 300 epochs using a learning rate of 0.0001 for SpMOT and VOR and for 400 epochs for VMDS. |
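The hyperparameters quoted in the Experiment Setup row can be collected into a small configuration sketch. This is illustrative only: the dictionary keys and structure are assumptions, not the authors' code; the values (optimizer, learning rate, batch sizes, β/γ weights, and slot counts K per data set) are taken from the paper's reported setup.

```python
# Hedged sketch of the reported training configuration for MONet and ViMON.
# Keys and layout are hypothetical; values come from the quoted setup.
monet_config = {
    "optimizer": "Adam",      # Adam (Kingma and Ba, 2015)
    "learning_rate": 1e-4,    # initial learning rate 0.0001
    "batch_size": 64,         # 64 for MONet
    "beta": 0.5,              # MONet: beta = 0.5
    "gamma": 1.0,             # MONet: gamma = 1
}

# ViMON shares the optimizer and learning rate but differs in
# batch size and loss weights.
vimon_config = {**monet_config, "batch_size": 32, "beta": 1.0, "gamma": 2.0}

# Number of slots K per data set, as reported.
num_slots = {"SpMOT": 5, "VMDS": 6, "VOR": 8}

print(vimon_config["batch_size"], num_slots["VOR"])
```

Keeping such settings in explicit per-model configurations is one common way to make the reported setup reproducible at a glance.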