Learn How to Query from Unlabeled Data Streams in Federated Learning

Authors: Yuchang Sun, Xinran Li, Tao Lin, Jun Zhang

AAAI 2025

Reproducibility Variable Result LLM Response
Research Type | Experimental | "Extensive simulations on image and text tasks show that LeaDQ advances the model performance in various FL scenarios, outperforming the benchmarking algorithms. [...] We conduct simulations to compare the LeaDQ algorithm with several representative data querying strategies. The experimental results on various image and text tasks demonstrate that LeaDQ selects samples that result in more meaningful model updates, leading to improved model accuracy."
Researcher Affiliation | Academia | "1 Department of Electronic and Computer Engineering, The Hong Kong University of Science and Technology; 2 Westlake University"
Pseudocode | Yes | "Algorithm 1: The LeaDQ framework"
Open Source Code | Yes | "The code is available at https://github.com/hiyuchang/leadq/."
Open Datasets | Yes | "We evaluate the algorithms on two image classification tasks, i.e., SVHN (Netzer et al. 2011) and CIFAR-100 (Krizhevsky and Hinton 2009), and one text classification task, i.e., 20Newsgroup (Lang 1995)."
Dataset Splits | No | The paper discusses allocating training data to clients and computing model accuracy on test data, but it does not give specific percentages, sample counts, or references to predefined train/validation/test splits for the datasets. For example, it states "The model accuracy is computed on the test data." without defining how the test data is split from the whole.
Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU models, CPU types, memory) used for running the experiments.
Software Dependencies | No | The paper mentions models such as "ResNet-18" and "DistilBERT" and algorithms such as "FedAvg" and "QMIX", but it does not list any specific software libraries or tools with version numbers.
Experiment Setup | Yes | "We simulate an FL system with one server and K = 10 clients. [...] In each round, Nu = 10 unlabeled data samples arrive at each client independently and each client selects Nq = 1 data sample for label querying. [...] To simulate the non-IID setting, we allocate the training data to clients according to the Dirichlet distribution with concentration parameter α = 0.5 (Li et al. 2022). [...] The results are illustrated in Tables 3 and 4, respectively, in which we show the model accuracy after R = 500 rounds when applying different data querying strategies."
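The Dirichlet-based non-IID allocation quoted above (α = 0.5, following Li et al. 2022) is a standard partitioning scheme. A minimal sketch of one common variant, drawing per-class proportions over clients from Dir(α); the function name and details are illustrative, not taken from the paper's released code:

```python
import numpy as np

def dirichlet_partition(labels, num_clients=10, alpha=0.5, seed=0):
    """Split sample indices across clients with a per-class Dirichlet draw.

    Smaller alpha -> more skewed (non-IID) class distributions per client.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    client_indices = [[] for _ in range(num_clients)]
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)
        rng.shuffle(idx)
        # Proportion of class c's samples assigned to each client.
        props = rng.dirichlet(alpha * np.ones(num_clients))
        # Turn proportions into cut points over this class's index list.
        cuts = (np.cumsum(props)[:-1] * len(idx)).astype(int)
        for client, part in enumerate(np.split(idx, cuts)):
            client_indices[client].extend(part.tolist())
    return client_indices
```

With α = 0.5 and K = 10 as in the quoted setup, each client ends up holding a skewed subset of classes; every sample is assigned to exactly one client.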