Generating Synthetic Data for Unsupervised Federated Learning of Cross-Modal Retrieval
Authors: Tianlong Zhang, Zhe Xue, Adnan Mahmood, Junping Du, Yuchen Dong, Shilong Ou, Lang Feng, Ming-Hsuan Yang, Yuankai Qi
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments using four baselines across three datasets demonstrate that our method performs favorably against state-of-the-art methods. For evaluating our proposed method, we select three widely-used datasets: MIRFLICKR (Huiskes and Lew 2008), MS COCO (Lin et al. 2014), and NUS-WIDE (Chua et al. 2009). |
| Researcher Affiliation | Academia | 1Beijing University of Posts and Telecommunications, 2Macquarie University, 3University of California at Merced |
| Pseudocode | No | The paper describes the methodology in prose and includes a main architecture diagram (Figure 1), but does not contain any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain an explicit statement about releasing source code or provide a link to a code repository. |
| Open Datasets | Yes | For evaluating our proposed method, we select three widely-used datasets: MIRFLICKR (Huiskes and Lew 2008), MS COCO (Lin et al. 2014), and NUS-WIDE (Chua et al. 2009). |
| Dataset Splits | No | The paper mentions using a Dirichlet distribution for non-IID data generation and specifies the total number of samples for NUS-WIDE (over 215,000 sample pairs), but does not explicitly provide training, validation, or test dataset splits (e.g., percentages, specific counts, or references to predefined splits). |
| Hardware Specification | Yes | All the experiments are implemented on an RTX A6000 GPU. |
| Software Dependencies | No | The paper mentions using the pre-trained CNN-F and BERT models and the Adam optimizer, but does not provide specific version numbers for these or any other software dependencies, nor does it specify the programming language or libraries used with versions. |
| Experiment Setup | Yes | We use the pre-trained CNN-F (Chatfield et al. 2014) to extract each image's 2,048-dimension feature representation and use BERT (Devlin et al. 2019) to extract the 2,048-dimension feature representation for each text. We apply the Adam optimizer with a batch size of 64 and a learning rate of 5×10⁻⁵. We apply the Dirichlet distribution to obtain non-IID data, with the parameter β controlling the distribution, where β is set to 0.1 by default. The number of prototypes is set to {20, 40, 20} for the three datasets separately. The communication rounds are set to {50, 100, 70} for the three datasets separately. The local training epoch and global training epoch are both set to 30. Every time the generator is updated, the discriminator updates 6 times. σ1 and σ2 are temperature coefficients set to 0.5. The adopted values of µ are {0.2, 0.3, 0.2} for the three datasets separately in our experiments. |
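The setup above states that non-IID client data is produced with a Dirichlet distribution (β = 0.1 by default). The paper does not show its partitioning code, so the following is a minimal sketch of the standard Dirichlet label-skew split commonly used in federated learning papers; the function name `dirichlet_partition` and its signature are hypothetical, not from the paper.

```python
import numpy as np

def dirichlet_partition(labels, n_clients, beta=0.1, seed=0):
    """Split sample indices across clients with Dirichlet(beta) label skew.

    A smaller beta yields a more skewed (more non-IID) per-client label
    distribution; beta -> infinity approaches an IID split.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    n_classes = int(labels.max()) + 1
    client_indices = [[] for _ in range(n_clients)]
    for c in range(n_classes):
        idx = np.flatnonzero(labels == c)
        rng.shuffle(idx)
        # Draw the fraction of class c assigned to each client.
        props = rng.dirichlet(np.full(n_clients, beta))
        # Convert fractions to split points over the class's samples.
        splits = (np.cumsum(props)[:-1] * len(idx)).astype(int)
        for client, part in zip(client_indices, np.split(idx, splits)):
            client.extend(part.tolist())
    return client_indices
```

With β = 0.1, most clients end up holding samples from only a few classes, which is the heterogeneity regime the paper evaluates under.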