Semi-Supervised Online Cross-Modal Hashing

Authors: Xiao Kang, Xingbo Liu, Xuening Zhang, Wen Xue, Xiushan Nie, Yilong Yin

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on benchmark datasets demonstrate the superiority of SSOCH under various scenarios, highlighting the importance of semi-supervised learning for online cross-modal hashing.
Researcher Affiliation | Collaboration | (1) School of Software, Shandong University, Jinan 250101, China; (2) School of Computer Science and Technology, Shandong Jianzhu University, Jinan 250101, China; (3) School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen 518055, China; (4) Shandong Yunhai Guochuang Cloud Computing Equipment Industry Innovation Co., Ltd, Jinan, China
Pseudocode | No | The paper describes the optimization process in steps (G-Step, V-Step, W-Step, B-Step, R-Step) using mathematical formulations, but it does not present structured pseudocode or an algorithm block.
Open Source Code | No | The paper does not provide any concrete access information, such as a repository link or an explicit statement of code release, for the methodology described.
Open Datasets | Yes | To validate the effectiveness of the proposed SSOCH method, we conduct experiments on three widely-used cross-modal retrieval datasets: IAPR TC-12 (Escalante et al. 2010), NUSWIDE (Chua et al. 2009) and MIRFLICKR (Huiskes and Lew 2008).
Dataset Splits | No | The paper mentions 'We sample 10% instances in each chunk for the supervised baselines to simulate the semi-supervised training situation,' and describes data arriving in a streaming fashion with labeled/unlabeled instances in chunks. However, it does not provide specific training, validation, and test dataset splits (e.g., percentages or sample counts) for the overall datasets used.
Hardware Specification | Yes | All experiments are performed on a computer with an Intel(R) Core(TM) i9-10900K CPU @ 3.70GHz and 64GB RAM.
Software Dependencies | No | The paper does not provide specific software dependencies or library versions used for the implementation.
Experiment Setup | Yes | In the implementation, we empirically set ξ_m = 0.5, δ = 10^3 and γ = 10. Moreover, we set α = 10^4, β = 10 and θ = 10^8 through cross-validation and grid search. The number of iterations T is set to 10. For simplicity, the number of categories in the pseudo-label is fixed as 3. The length of the Hadamard label C is set as 32 for the MIRFLICKR and NUSWIDE datasets, and 256 for the IAPR TC-12 dataset.
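The streaming protocol quoted under Dataset Splits (data arriving in chunks, with 10% of each chunk sampled as labeled) can be simulated with a short sketch. This is not the authors' code; the chunk size, function names, and index-based data are hypothetical illustrations of the described protocol.

```python
import random

def stream_chunks(num_instances, chunk_size):
    """Yield index chunks to mimic data arriving in a streaming fashion."""
    indices = list(range(num_instances))
    for start in range(0, num_instances, chunk_size):
        yield indices[start:start + chunk_size]

def split_chunk(chunk, labeled_ratio=0.10, seed=0):
    """Sample a fraction of a chunk as labeled (10% per the paper); the rest is unlabeled."""
    rng = random.Random(seed)
    labeled = set(rng.sample(chunk, max(1, int(len(chunk) * labeled_ratio))))
    unlabeled = [i for i in chunk if i not in labeled]
    return sorted(labeled), unlabeled

# Hypothetical example: 10,000 instances arriving in chunks of 2,000.
for chunk in stream_chunks(10_000, 2_000):
    labeled, unlabeled = split_chunk(chunk)
    # each chunk: 200 labeled, 1,800 unlabeled
```

Such a split per chunk is what lets supervised baselines be compared fairly against a semi-supervised method under the same label budget.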
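For reference, the hyperparameters quoted under Experiment Setup can be collected into a single configuration sketch. The dictionary keys are illustrative names, not identifiers from the authors' implementation; the values follow the quoted setup.

```python
# Hyperparameters reported for SSOCH (key names are illustrative).
SSOCH_CONFIG = {
    "xi_m": 0.5,                  # set empirically
    "delta": 1e3,                 # set empirically
    "gamma": 10,                  # set empirically
    "alpha": 1e4,                 # chosen via cross-validation and grid search
    "beta": 10,                   # chosen via cross-validation and grid search
    "theta": 1e8,                 # chosen via cross-validation and grid search
    "iterations_T": 10,
    "pseudo_label_categories": 3,
}

# Hadamard label length C is dataset-dependent.
HADAMARD_LABEL_LENGTH = {"MIRFLICKR": 32, "NUSWIDE": 32, "IAPR TC-12": 256}
```

Keeping the grid-searched values (alpha, beta, theta) separate from the empirically fixed ones mirrors how the paper reports them, which matters when attempting reproduction on a new dataset.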