Generating Synthetic Data for Unsupervised Federated Learning of Cross-Modal Retrieval

Authors: Tianlong Zhang, Zhe Xue, Adnan Mahmood, Junping Du, Yuchen Dong, Shilong Ou, Lang Feng, Ming-Hsuan Yang, Yuankai Qi

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments using four baselines across three datasets demonstrate that our method performs favorably against state-of-the-art methods. … For evaluating our proposed method, we select three widely-used datasets: MIRFLICKR (Huiskes and Lew 2008), MS COCO (Lin et al. 2014), and NUS-WIDE (Chua et al. 2009).
Researcher Affiliation | Academia | 1) Beijing University of Posts and Telecommunications, 2) Macquarie University, 3) University of California at Merced. EMAIL, EMAIL, EMAIL
Pseudocode | No | The paper describes the methodology in prose and includes a main architecture diagram (Figure 1), but does not contain any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | No | The paper does not contain an explicit statement about releasing source code or provide a link to a code repository.
Open Datasets | Yes | For evaluating our proposed method, we select three widely-used datasets: MIRFLICKR (Huiskes and Lew 2008), MS COCO (Lin et al. 2014), and NUS-WIDE (Chua et al. 2009).
Dataset Splits | No | The paper mentions using a Dirichlet distribution to generate non-IID data and specifies the total number of samples for NUS-WIDE (over 215,000 sample pairs), but does not explicitly provide training, validation, or test splits (e.g., percentages, specific counts, or references to predefined splits).
Hardware Specification | Yes | All the experiments are implemented on an RTX A6000 GPU.
Software Dependencies | No | The paper mentions using pre-trained CNN-F and BERT models and the Adam optimizer, but does not provide version numbers for these or any other software dependencies, nor does it specify the programming language or libraries used with versions.
Experiment Setup | Yes | We use the pre-trained CNN-F (Chatfield et al. 2014) to extract each image's 2,048-dimension feature representation and use BERT (Devlin et al. 2019) to extract the 2,048-dimension feature representation for each text. We apply the Adam optimizer with a batch size of 64 and a learning rate of 5×10⁻⁵. We apply the Dirichlet distribution to obtain non-IID data, with the parameter β controlling the distribution, where β is set to 0.1 by default. The number of prototypes is set to {20, 40, 20} for the three datasets separately. The communication rounds are set to {50, 100, 70} for the three datasets separately. The local training epoch and global training epoch are both set to 30. Every time the generator is updated, the discriminator is updated 6 times. σ1 and σ2 are temperature coefficients set to 0.5. The adopted values for µ are {0.2, 0.3, 0.2} for the three datasets separately in our experiments.
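The β-controlled Dirichlet split mentioned under Dataset Splits and Experiment Setup is a standard way to simulate non-IID federated clients. A minimal sketch follows; `dirichlet_partition` and its arguments are illustrative names, not from the paper, and the sketch assumes per-sample class labels are available to skew (smaller β yields a more non-IID split).

```python
import numpy as np

def dirichlet_partition(labels, num_clients, beta=0.1, seed=0):
    """Split sample indices across clients with label proportions
    drawn from Dir(beta) per class; smaller beta -> more non-IID."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    client_idx = [[] for _ in range(num_clients)]
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)
        rng.shuffle(idx)
        # Proportion of class c assigned to each client.
        props = rng.dirichlet(np.full(num_clients, beta))
        cuts = (np.cumsum(props)[:-1] * len(idx)).astype(int)
        for client, part in zip(client_idx, np.split(idx, cuts)):
            client.extend(part.tolist())
    return client_idx
```

With β = 0.1 (the paper's default) most of each class typically lands on one or two clients, which is the heterogeneity the method is evaluated against.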
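The Experiment Setup row scatters hyperparameters across several sentences; collecting them in one place makes the configuration easier to reproduce. The key names below are our own, and the mapping of the per-dataset triples to (MIRFLICKR, MS COCO, NUS-WIDE) is an assumption based on the order in which the paper lists the datasets.

```python
# Hyperparameters as reported in the paper. Dataset-to-value mapping
# is assumed from the paper's dataset listing order.
CONFIG = {
    "batch_size": 64,
    "learning_rate": 5e-5,            # Adam optimizer
    "dirichlet_beta": 0.1,            # non-IID skew parameter
    "num_prototypes": {"MIRFLICKR": 20, "MS COCO": 40, "NUS-WIDE": 20},
    "communication_rounds": {"MIRFLICKR": 50, "MS COCO": 100, "NUS-WIDE": 70},
    "local_epochs": 30,
    "global_epochs": 30,
    "discriminator_updates_per_generator_update": 6,
    "sigma1": 0.5,                    # temperature coefficient
    "sigma2": 0.5,                    # temperature coefficient
    "mu": {"MIRFLICKR": 0.2, "MS COCO": 0.3, "NUS-WIDE": 0.2},
}
```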