Q-MiniSAM2: A Quantization-based Benchmark for Resource-Efficient Video Segmentation
Authors: Xuanxuan Ren, Xiangyu Li, Kun Wei, Xu Yang, Yanhua Yang
IJCAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 4 Experiments |
| Researcher Affiliation | Academia | Xidian University EMAIL, EMAIL, EMAIL |
| Pseudocode | Yes | Algorithm 1 Post-Training Quantization |
| Open Source Code | No | Explanation: The paper does not explicitly state that its source code is available or provide a link to a repository for the methodology described. |
| Open Datasets | Yes | We conduct experiments on two object segmentation datasets: MS-COCO [Lin et al., 2014] and SA-V [Ravi et al., 2024]. |
| Dataset Splits | Yes | MS-COCO contains 123,000 images across 91 object categories, of which the training set contains 118,000 images and the validation set contains 5,000 images. The SA-V dataset comprises approximately 51,000 real-world videos and over 600,000 spatiotemporal masks (referred to as masklets), establishing it as the largest video segmentation dataset to date. Specifically, the training split consists of 50,583 videos and 642,036 masklets, while the validation split includes 155 videos and 293 masklets. Additionally, the test split contains 150 videos and 278 masklets. |
| Hardware Specification | No | Explanation: The paper mentions 'On specialized hardware' in the context of theoretical speedup, but it does not provide specific details on the GPU/CPU models or other hardware used for running experiments. |
| Software Dependencies | No | Explanation: The paper mentions 'YOLOX [Ge, 2021]' as a detector but does not provide specific version numbers for any software or libraries used. |
| Experiment Setup | Yes | For quantization training, a set of 32 unannotated training images is randomly selected to form the training dataset. In the prompt-based visual segmentation task, to obtain accurate target masks through manually annotated box prompts, 8 videos are randomly chosen from the SA-V validation set, with 20 frames extracted from each video to construct the training dataset. Following conventional methodologies, the implemented quantization strategy includes per-channel asymmetric quantization for weights and per-tensor asymmetric quantization for activation values. Each module undergoes 20,000 iterations during the reconstruction phase. Additionally, to ensure the stability and robustness of the model's performance, the first and last layers (or modules) of the network are exempted from the quantization process. The hyperparameters α, β, and γ are set to 1, 0.5, and 0.4, respectively. |
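The quantization scheme reported above (per-channel asymmetric for weights, per-tensor asymmetric for activations) can be illustrated with a minimal NumPy sketch. This is not the paper's implementation, only the standard asymmetric uniform quantizer the setup refers to; the function names and the `axis` convention are our own.

```python
import numpy as np

def asymmetric_quantize(x, num_bits=8, axis=None):
    """Asymmetric uniform quantization to unsigned integers.

    axis=None -> per-tensor: one (scale, zero_point) for the whole tensor,
                 as used for activations in the paper's setup.
    axis=1    -> per-channel for a 2D weight of shape (out_ch, in_ch):
                 one (scale, zero_point) per output channel.
    """
    qmax = 2 ** num_bits - 1
    x_min = x.min(axis=axis, keepdims=True)
    x_max = x.max(axis=axis, keepdims=True)
    scale = (x_max - x_min) / qmax
    scale = np.where(scale == 0, 1e-8, scale)  # guard constant tensors
    zero_point = np.round(-x_min / scale)      # maps x_min to integer 0
    q = np.clip(np.round(x / scale + zero_point), 0, qmax)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Map integer codes back to the real-valued approximation."""
    return (q - zero_point) * scale
```

Because the zero point shifts the grid so that both `x_min` and `x_max` are representable, the round-trip error of each element is bounded by half a quantization step (`scale / 2`), which is why asymmetric schemes are preferred for non-zero-centered activations.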