Efficient Heterogeneity-Aware Federated Active Data Selection
Authors: Ying-Peng Tang, Chao Ren, Xiaoli Tang, Sheng-Jun Huang, Lizhen Cui, Han Yu
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on 11 benchmark datasets demonstrate significant improvements of FALE over existing state-of-the-art methods. ... We plot the mean learning curves over 10 runs for the compared methods in Fig. 2. The mean and standard deviation of MSE on the test set are reported. |
| Researcher Affiliation | Academia | 1College of Computing and Data Science, Nanyang Technological University, Singapore 2School of Electrical Engineering and Computer Science, KTH Royal Institute of Technology, Sweden 3College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing, China 4School of Software, Shandong University, Jinan, China. Correspondence to: Han Yu <EMAIL>. |
| Pseudocode | Yes | Algorithm 1 The FALE Algorithm ... Algorithm 2 FALE-local Algorithm |
| Open Source Code | No | The paper states: "We implement the regression model using PyTorch... The implementation of LOGO (Kim et al., 2023) is sourced from the authors. The FL framework is built upon the FedLab (Dun Zeng & Xu, 2021) toolbox." However, it provides no link to, or statement about the availability of, source code for the FALE method itself. |
| Open Datasets | Yes | We employ 9 UCI (Dua & Graff, 2017) and OpenML (Bischl et al., 2021) regression benchmarks in our experiments. ... CelebA is a facial image dataset ... Following Lyu et al. (2025)... IMDB-WIKI (Rothe et al., 2018) is a facial image dataset with age annotations. We adopt the dataset settings from Yang et al. (2021)... |
| Dataset Splits | Yes | For each dataset, we uniformly sample 20% of the instances for testing, while the remaining instances are distributed across k = 10 clients in a non-i.i.d. manner... To simulate the non-i.i.d. setting in regression tasks, we perform binning on the regression target vector, with the number of bins equal to the number of clients. We then adopt the Dirichlet distribution strategy (Yurochkin et al., 2019) with a Dirichlet alpha of 5 to assign instances to clients, using the bins as class labels. ... For each dataset, 5% of each client's data is uniformly sampled to form the initial labeled set. |
| Hardware Specification | No | The paper does not explicitly describe the hardware used to run its experiments. It mentions using PyTorch for implementation and FedLab for the FL framework, but no specific details about CPUs, GPUs, or other computational resources are provided. |
| Software Dependencies | No | The paper states: "We implement the regression model using PyTorch... The FL framework is built upon the FedLab (Dun Zeng & Xu, 2021) toolbox." It mentions PyTorch and FedLab but does not provide specific version numbers for these or any other key software components. |
| Experiment Setup | Yes | The model is trained by optimizing the mean squared error (MSE) with the Adam optimizer, using a learning rate of 0.01 for 25 epochs. For local active regression methods, we utilize the implementation provided by Holzmüller et al. (2023). ... At each iteration, we allocate a query budget of 5 instances per client, resulting in a total of 50 instances queried per round. |
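The non-i.i.d. partitioning described in the Dataset Splits row (bin the regression targets, then assign instances via a Dirichlet distribution with the bins as pseudo-class labels) can be sketched as follows. This is a minimal reconstruction from the report's description, not the authors' code; the function name `dirichlet_split` and the quantile-based binning are assumptions.

```python
import numpy as np

def dirichlet_split(y, n_clients=10, alpha=5.0, seed=0):
    """Partition regression data non-i.i.d. across clients.

    Sketch of the strategy described in the paper: bin the target
    vector into n_clients bins, then distribute each bin's instances
    to clients via Dirichlet-sampled proportions, treating bins as
    pseudo-class labels (Yurochkin et al., 2019).
    """
    rng = np.random.default_rng(seed)
    # Quantile-based binning is an assumption; the paper only says
    # "binning" with bin count equal to client count.
    edges = np.quantile(y, np.linspace(0, 1, n_clients + 1)[1:-1])
    labels = np.digitize(y, edges)  # pseudo-class label in 0..n_clients-1
    client_idx = [[] for _ in range(n_clients)]
    for b in range(n_clients):
        idx = np.flatnonzero(labels == b)
        rng.shuffle(idx)
        # Dirichlet proportions decide how many of this bin's
        # instances each client receives (alpha=5 in the paper).
        props = rng.dirichlet(alpha * np.ones(n_clients))
        cuts = (np.cumsum(props)[:-1] * len(idx)).astype(int)
        for c, part in enumerate(np.split(idx, cuts)):
            client_idx[c].extend(part.tolist())
    return client_idx
```

A larger `alpha` (here 5) yields more balanced client shares, so this setting simulates mild rather than extreme heterogeneity.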
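The local training configuration in the Experiment Setup row (MSE loss, Adam optimizer, learning rate 0.01, 25 epochs) corresponds to a loop along these lines. This is a generic PyTorch sketch under the report's stated hyperparameters; the model architecture and the helper name `train_regressor` are assumptions, not the authors' implementation.

```python
import torch
from torch import nn

def train_regressor(model, X, y, lr=0.01, epochs=25):
    """Minimal sketch of the reported local training setup:
    full-batch MSE minimization with Adam (lr=0.01, 25 epochs).
    Batching and the FedLab client/server wiring are omitted."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(model(X), y)
        loss.backward()
        opt.step()
    return loss.item()
```

In the paper's federated setting this loop would run per client per round, with FedLab handling aggregation of the resulting model updates.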