Federated TD Learning with Linear Function Approximation under Environmental Heterogeneity

Authors: Han Wang, Aritra Mitra, Hamed Hassani, George J. Pappas, James Anderson

TMLR 2024

Reproducibility assessment (variable: result, followed by the LLM's supporting response):
Research Type: Experimental
  Response: "Our most significant contribution is to provide the first analysis of a federated RL algorithm, Fed TD(0), that simultaneously accounts for linear function approximation, Markovian sampling, multiple local updates, and heterogeneity. In Theorem 2, we prove that after T communication rounds with K local model-updating steps per round, Fed TD(0) guarantees convergence at a rate of O(1/NKT) to a neighborhood of each agent's optimal parameter. ... The proof of the above result is deferred to Appendix I. ... Figure 2: Performance of Fed TD(0) under Markovian sampling. (a) Simulations on the effect of the linear speedup. (b) Simulations on the effect of the heterogeneity level."
Researcher Affiliation: Academia
  Response:
    Han Wang (EMAIL), Department of Electrical Engineering, Columbia University
    Aritra Mitra (EMAIL), Department of Electrical and Computer Engineering, North Carolina State University
    Hamed Hassani (EMAIL), Department of Electrical and Systems Engineering, University of Pennsylvania
    George J. Pappas (EMAIL), Department of Electrical and Systems Engineering, University of Pennsylvania
    James Anderson (EMAIL), Department of Electrical Engineering, Columbia University
Pseudocode: Yes
  Response: "Algorithm 1: Description of Fed TD(0)"
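The pseudocode row above refers to the paper's Algorithm 1, Fed TD(0): each agent runs K local TD(0) steps with linear function approximation between communication rounds, after which a server averages the parameters. The following is a minimal illustrative sketch of that update pattern, not the authors' code; the function name, environment interface, and all default values are assumptions.

```python
import numpy as np

def fed_td0(envs, phi, theta0, num_rounds, num_local_steps, alpha, gamma=0.95, seed=0):
    """Sketch of federated TD(0) with linear function approximation.

    N agents each run K local TD(0) steps on their own MDP between
    communication rounds; a server then averages the parameters.
    envs: list of callables env(state, rng) -> (reward, next_state).
    phi:  maps a state index to its feature vector.
    """
    rng = np.random.default_rng(seed)
    N = len(envs)
    thetas = [np.asarray(theta0, dtype=float).copy() for _ in range(N)]
    states = [0] * N                              # each agent samples its own chain
    theta_bar = thetas[0].copy()
    for _ in range(num_rounds):                   # T communication rounds
        for i, env in enumerate(envs):
            for _ in range(num_local_steps):      # K local model-updating steps
                s = states[i]
                r, s_next = env(s, rng)
                # semi-gradient TD(0) update on the local parameter
                td_err = r + gamma * phi(s_next) @ thetas[i] - phi(s) @ thetas[i]
                thetas[i] = thetas[i] + alpha * td_err * phi(s)
                states[i] = s_next
        theta_bar = np.mean(thetas, axis=0)       # server-side parameter averaging
        thetas = [theta_bar.copy() for _ in range(N)]
    return theta_bar
```

With identical agents this reduces to plain TD(0) with averaged noise; with heterogeneous MDPs the averaged iterate converges only to a neighborhood of each agent's optimum, consistent with the Theorem 2 statement quoted above.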
Open Source Code: No
  Response: The paper does not explicitly state that source code for the described methodology is publicly available. It discusses theoretical analysis and simulations, but gives no links or explicit statements about a code release.
Open Datasets: No
  Response: The paper does not explicitly use or reference any publicly available datasets. For its simulations, it states: "The MDP M(1) of the first agent is randomly generated with a state space of size n = 100. The remaining MDPs are perturbations of M(1) with the heterogeneity levels ϵ = 0.05 and ϵ1 = 0.1."
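The simulation data described above is generated, not downloaded: a random base MDP plus heterogeneous perturbations of it. The sketch below shows one plausible reading of that setup, mixing the transition kernel with a random one at level ϵ and perturbing rewards within ±ϵ1; the paper's exact perturbation construction may differ, and all function names here are illustrative.

```python
import numpy as np

def random_mdp(n, rng):
    """Random n-state MDP: row-stochastic transition matrix, rewards in [0, 1]."""
    P = rng.random((n, n))
    P /= P.sum(axis=1, keepdims=True)
    R = rng.random(n)
    return P, R

def perturb_mdp(P, R, eps, eps1, rng):
    """Heterogeneous copy of a base MDP (assumed construction): convex-mix the
    transition kernel with a random one at level eps, and shift rewards by
    noise bounded by eps1."""
    Q = rng.random(P.shape)
    Q /= Q.sum(axis=1, keepdims=True)
    P_new = (1.0 - eps) * P + eps * Q        # convex mix stays row-stochastic
    R_new = R + eps1 * (2.0 * rng.random(R.shape) - 1.0)
    return P_new, R_new

# Setup quoted above: n = 100 states, perturbations at eps = 0.05, eps1 = 0.1.
rng = np.random.default_rng(0)
P1, R1 = random_mdp(100, rng)
agents = [perturb_mdp(P1, R1, eps=0.05, eps1=0.1, rng=rng) for _ in range(4)]
```

The convex mixture keeps every perturbed kernel a valid stochastic matrix while bounding the total-variation distance from the base MDP by ϵ per row.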
Dataset Splits: No
  Response: The paper's simulations use randomly generated MDPs rather than pre-existing datasets with defined splits, so no training/validation/test split information is applicable or provided.
Hardware Specification: No
  Response: The paper does not report the hardware (e.g., GPU/CPU models, memory) used to run its simulations. The acknowledgments mention NSF support for some authors but give no hardware specifics.
Software Dependencies: No
  Response: The paper does not list the software dependencies (e.g., library names with version numbers) needed to replicate its experiments.
Experiment Setup: Yes
  Response: "The number of local steps is chosen as K = 10 in both plots. ... The heterogeneity levels ϵ = 0.05 and ϵ1 = 0.1. ... The number of local steps is chosen as K = 20 ... with the heterogeneity levels ϵ = 0.1 and ϵ1 = 0.1."