MaskPrompt: Open-Vocabulary Affordance Segmentation with Object Shape Mask Prompts

Authors: Dongpan Chen, Dehui Kong, Jinghua Li, Baocai Yin

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Quantitative and qualitative evaluations compared with state-of-the-art methods demonstrate that the proposed method achieves superior performance on the proposed benchmark dataset and other open-vocabulary part segmentation datasets. We conduct extensive experiments on the benchmark and other object part segmentation datasets, which demonstrates the effectiveness of our proposed method.
Researcher Affiliation | Academia | Dongpan Chen, Dehui Kong, Jinghua Li, Baocai Yin, School of Information Science and Technology, Beijing University of Technology, Beijing, China, EMAIL, EMAIL
Pseudocode | No | The paper describes the architecture and methodology in detail, outlining the steps and components, but does not include any formal pseudocode blocks or algorithms.
Open Source Code | No | The paper does not contain any explicit statement about releasing source code, nor does it provide a link to a code repository.
Open Datasets | Yes | We combine existing affordance segmentation dataset IIT-AFF (Nguyen et al. 2017) and part segmentation dataset Pascal-Part-108 (Michieli et al. 2020), and re-annotate labels according to the affordances of target entities (including objects, humans and animals) to construct an open-vocabulary affordance segmentation dataset, namely OVAS-25. ... We also evaluate the proposed model on another affordance segmentation dataset UMD (Myers et al. 2015) and other part segmentation datasets, i.e., Pascal-Part-58 (Chen et al. 2014), Pascal-Part-116 (Wei et al. 2024), Pascal-Part-201 (Singh et al. 2022), and ADE20K-Part-234 (Wei et al. 2024).
Dataset Splits | Yes | OVAS-25 has 28 entity classes and 25 affordance classes (as shown in Fig. 1), totalling 18938 images, of which 11363 are used for training and 7575 for testing.
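The reported split sizes are internally consistent, as a quick check confirms (the numbers are from the report; the snippet itself is ours, not from the paper):

```python
# Consistency check of the reported OVAS-25 split sizes.
train_images, test_images = 11_363, 7_575
total_images = train_images + test_images

assert total_images == 18_938  # matches the reported dataset size
print(f"train fraction: {train_images / total_images:.1%}")  # prints "train fraction: 60.0%"
```

In other words, roughly a 60/40 train/test split.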
Hardware Specification | Yes | All experiments are conducted on an NVIDIA A800 80GB GPU.
Software Dependencies | No | The paper mentions using specific tools like DETR, SAM, Alpha-CLIP, CLIP's text encoder, and MaskFormer, but does not provide specific version numbers for any of these software components or other libraries.
Experiment Setup | Yes | We train the whole model for 120K iterations with a learning rate of 10^-4 decreased by 10 times at 60K and 100K iterations. We optimize the network by AdamW with the weight decay 10^-4 and batch size 32. The layers of pixel decoder L is 6. In each layer, the embedding dimension is 768, the head number of the multi-head attention is 12, d is 512, and the hidden dimension of the feed-forward network is 3072. For the dimensions of text and vision features, dt, dv, dvt, and dcls are all 512.
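The reported learning-rate schedule (base lr 10^-4, divided by 10 at 60K and 100K of 120K total iterations) is a standard step decay. A minimal sketch, assuming the milestones and factor above; the helper name `lr_at_step` is ours, not from the paper:

```python
# Step-decay schedule as described in the reported experiment setup.
BASE_LR = 1e-4                  # initial learning rate
MILESTONES = (60_000, 100_000)  # iterations where the lr is divided by 10
TOTAL_ITERS = 120_000           # total training iterations

def lr_at_step(step: int) -> float:
    """Learning rate after `step` training iterations (illustrative helper)."""
    decays = sum(step >= m for m in MILESTONES)  # milestones already passed
    return BASE_LR * 0.1 ** decays

# lr_at_step(0) is 1e-4; after 60K iterations the lr drops to ~1e-5,
# and after 100K to ~1e-6.
```

In a PyTorch training loop this would correspond to `torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[60_000, 100_000], gamma=0.1)` with one scheduler step per iteration.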