DDPA-3DVG: Vision-Language Dual-Decoupling and Progressive Alignment for 3D Visual Grounding

Authors: Hongjie Gu, Jinlong Fan, Liang Zheng, Jing Zhang, Yuxiang Yang

IJCAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experiments on three challenging benchmarks, ScanRefer, Nr3D, and Sr3D, demonstrate that our method achieves state-of-the-art performance, validating its effectiveness in 3D visual grounding."
Researcher Affiliation | Academia | Hongjie Gu1, Jinlong Fan1, Liang Zheng1, Jing Zhang2, Yuxiang Yang1. 1School of Electronics and Information, Hangzhou Dianzi University, China; 2School of Computer Science, Wuhan University, China. EMAIL, EMAIL
Pseudocode | No | The paper describes the methodology using textual explanations and diagrams (e.g., Figure 2: Overview of the Proposed DDPA-3DVG), but it does not contain any explicit pseudocode blocks or algorithm listings.
Open Source Code | Yes | "The code will be released at https://github.com/HDU-VRLab/DDPA-3DVG."
Open Datasets | Yes | "We evaluated the effectiveness of DDPA-3DVG using three widely adopted and challenging datasets: ScanRefer [Chen et al., 2020], Sr3D and Nr3D [Achlioptas et al., 2020]." ScanRefer is a 3D visual grounding dataset constructed upon 800 scenes from ScanNet [Dai et al., 2017].
Dataset Splits | No | The paper mentions evaluating on the ScanRefer validation set, and that each scene is categorized as easy or hard based on the number of object instances. It also reports "Unique (19%)" and "Multiple (81%)" subsets for ScanRefer, which refer to object characteristics rather than data partitions. However, it does not explicitly provide the training/validation/test split percentages, sample counts, or the splitting methodology used in the experiments.
Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or memory specifications used for running the experiments.
Software Dependencies | No | The paper mentions employing "a pre-trained RoBERTa [Liu et al., 2019] model" and "PointNet++ [Qi et al., 2017]", as well as "existing NLP tools [Schuster et al., 2015; Wu et al., 2019]". However, it does not list the specific software libraries or frameworks, with version numbers, needed to replicate the experiments (e.g., PyTorch 1.x, Python 3.x).
Experiment Setup | No | The paper discusses the progressive alignment module, prediction head, and alignment losses, and compares convergence rates against another method (Figure 5, noting that "the number of epochs required for our method to reach a performance of 52.7% is 37 fewer than that of EDA"). However, it does not explicitly provide hyperparameter values (e.g., learning rate, batch size, optimizer settings) or other system-level training configurations for the proposed method.