Open-YOLO 3D: Towards Fast and Accurate Open-Vocabulary 3D Instance Segmentation

Authors: Mohamed El Amine Boudjoghra, Angela Dai, Jean Lahoud, Hisham Cholakkal, Rao Anwer, Salman Khan, Fahad Khan

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We validate our Open-YOLO 3D on two benchmarks, ScanNet200 and Replica, under two scenarios: (i) with ground truth masks, where labels are required for given object proposals, and (ii) with class-agnostic 3D proposals generated from a 3D proposal network. Our Open-YOLO 3D achieves state-of-the-art performance on both datasets while obtaining up to a 16× speedup compared to the best existing method in the literature. On the ScanNet200 val. set, our Open-YOLO 3D achieves a mean average precision (mAP) of 24.7% while operating at 22 seconds per scene. github.com/aminebdj/OpenYOLO3D
Researcher Affiliation | Academia | Mohamed El Amine Boudjoghra (TUM, MBZUAI); Angela Dai (TUM); Jean Lahoud (MBZUAI); Hisham Cholakkal (MBZUAI); Rao Muhammad Anwer (MBZUAI, Aalto University); Salman Khan (MBZUAI, ANU); Fahad Shahbaz Khan (MBZUAI, Linköping University)
Pseudocode | No | The paper describes the methodology using text and diagrams (e.g., Figure 2 for the overall pipeline) but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks.
Open Source Code | Yes | On the ScanNet200 val. set, our Open-YOLO 3D achieves a mean average precision (mAP) of 24.7% while operating at 22 seconds per scene. github.com/aminebdj/OpenYOLO3D
Open Datasets | Yes | We conduct our experiments using the ScanNet200 (Rozenberszki et al., 2022) and Replica (Straub et al., 2019) datasets.
Dataset Splits | Yes | Our analysis on ScanNet200 is based on its validation set, comprising 312 scenes. For the 3D instance segmentation task, we utilize the 200 predefined categories from the ScanNet200 annotations. ... We use RGB-depth pairs from the ScanNet200 and Replica datasets, processing every 10th frame for ScanNet200 and all frames for Replica, maintaining the same settings as OpenMask3D for fair comparison.
Hardware Specification | Yes | We use a single NVIDIA A100 40GB GPU for all experiments.
Software Dependencies | No | The paper mentions several models and frameworks, such as YOLO-World (Cheng et al., 2024), Mask3D (Schult et al., 2023), SAM (Kirillov et al., 2023), and CLIP (Zhang et al., 2023), but does not provide specific version numbers for any software dependencies or libraries.
Experiment Setup | No | The paper states: 'We use RGB-depth pairs from the ScanNet200 and Replica datasets, processing every 10th frame for ScanNet200 and all frames for Replica, maintaining the same settings as OpenMask3D for fair comparison.' and 'To create LG label maps, we use the YOLO-World (Cheng et al., 2024) extra-large model for its real-time capability and high zero-shot performance.' While it mentions specific models and some processing choices, it lacks the concrete numerical hyperparameters (e.g., learning rate, batch size, number of epochs, optimizer settings) in the main text that would be required for full reproducibility.
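The frame-sampling choice quoted above (every 10th RGB-D frame for ScanNet200, all frames for Replica) can be sketched as a small helper. This is an illustrative sketch, not code from the paper's repository; the function name and dataset strings are assumptions.

```python
# Hedged sketch of the per-dataset frame subsampling described in the report:
# ScanNet200 uses every 10th RGB-D frame, Replica uses every frame.
# `select_frames` and the dataset identifiers are illustrative names.

def select_frames(frame_ids, dataset):
    """Return the subset of RGB-D frame indices to process for a dataset."""
    stride = 10 if dataset == "scannet200" else 1  # Replica keeps all frames
    return frame_ids[::stride]

scannet_frames = select_frames(list(range(100)), "scannet200")  # 10 frames
replica_frames = select_frames(list(range(100)), "replica")     # 100 frames
```

Subsampling the video stream this way trades a denser set of 2D views for per-scene runtime, which is consistent with the paper's emphasis on fast inference.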