Federated TD Learning with Linear Function Approximation under Environmental Heterogeneity
Authors: Han Wang, Aritra Mitra, Hamed Hassani, George J. Pappas, James Anderson
TMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our most significant contribution is to provide the first analysis of a federated RL algorithm, Fed TD(0), that simultaneously accounts for linear function approximation, Markovian sampling, multiple local updates, and heterogeneity. In Theorem 2, we prove that after T communication rounds with K local model-updating steps per round, Fed TD(0) guarantees convergence at a rate of O(1/NKT) to a neighborhood of each agent’s optimal parameter. ... The proof of the above result is deferred to Appendix I. ... Figure 2: Performance of Fed TD(0) under Markovian sampling. (a) Simulations on the effect of the linear speedup (b) Simulations on the effect of the heterogeneity level. |
| Researcher Affiliation | Academia | Han Wang EMAIL Department of Electrical Engineering Columbia University Aritra Mitra EMAIL Department of Electrical and Computer Engineering North Carolina State University Hamed Hassani EMAIL Department of Electrical and Systems Engineering University of Pennsylvania George J. Pappas EMAIL Department of Electrical and Systems Engineering University of Pennsylvania James Anderson EMAIL Department of Electrical Engineering Columbia University |
| Pseudocode | Yes | Algorithm 1 Description of Fed TD(0) |
| Open Source Code | No | The paper does not explicitly state that source code for the methodology described is publicly available. It contains discussions of theoretical analysis and simulations, but no links or explicit statements about code release. |
| Open Datasets | No | The paper does not explicitly use or reference any publicly available datasets. For simulations, it states: "The MDP M(1) of the first agent is randomly generated with a state space of size n = 100. The remaining MDPs are perturbations of M(1) with the heterogeneity levels ϵ = 0.05 and ϵ1 = 0.1." |
| Dataset Splits | No | The paper uses randomly generated MDPs for its simulations, not pre-existing datasets with defined splits. Therefore, no dataset split information (like training/validation/test percentages) is applicable or provided. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its simulations. The acknowledgments mention support by NSF for some authors but no hardware specifics. |
| Software Dependencies | No | The paper does not provide specific software dependency details (e.g., library names with version numbers) needed to replicate the experiments. |
| Experiment Setup | Yes | The number of local steps is chosen as K = 10 in both plots. ... The heterogeneity levels ϵ = 0.05 and ϵ1 = 0.1. ... The number of local steps is chosen as K = 20. ... with the heterogeneity levels ϵ = 0.1 and ϵ1 = 0.1. |
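The table above reports that the paper gives pseudocode for Fed TD(0) (Algorithm 1) but releases no code. To make the described setup concrete, here is a minimal sketch of the Fed TD(0) structure the paper analyzes: N agents each run K local TD(0) steps with linear function approximation under Markovian sampling, and a server averages the local parameters every communication round. Everything below other than that structure is a labeled assumption: the dimensions, step size, feature matrix, and the particular way MDP heterogeneity is modeled (mixing agent 1's kernel with a random kernel at level ϵ) are illustrative choices, not the paper's exact simulation protocol.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions (the paper uses n = 100 states, K = 10 or 20;
# smaller values here keep the sketch fast).
n_states, d, N, K, T = 20, 5, 10, 10, 50
gamma, alpha, eps = 0.9, 0.05, 0.05  # discount, step size, heterogeneity level

def random_mdp(n):
    """A random transition matrix (rows sum to 1) and reward vector."""
    P = rng.random((n, n))
    P /= P.sum(axis=1, keepdims=True)
    return P, rng.random(n)

# Agent 1's MDP; the others are perturbations of it at heterogeneity level eps.
# (Mixing kernels is one simple way to model this; the paper's perturbation
# scheme may differ.)
P0, r0 = random_mdp(n_states)
mdps = [(P0, r0)]
for _ in range(N - 1):
    Q, _ = random_mdp(n_states)
    P = (1 - eps) * P0 + eps * Q
    r = np.clip(r0 + eps * rng.standard_normal(n_states), 0.0, None)
    mdps.append((P, r))

# Fixed linear features shared by all agents (assumed, d-dimensional).
Phi = rng.standard_normal((n_states, d)) / np.sqrt(d)

def local_td0(theta, P, r, s, K):
    """K local TD(0) steps along a single Markovian trajectory."""
    for _ in range(K):
        s_next = rng.choice(n_states, p=P[s])
        # TD error with linear value estimate V(s) = Phi[s] @ theta
        delta = r[s] + gamma * Phi[s_next] @ theta - Phi[s] @ theta
        theta = theta + alpha * delta * Phi[s]
        s = s_next
    return theta, s

theta = np.zeros(d)
states = [int(rng.integers(n_states)) for _ in range(N)]
for t in range(T):  # T communication rounds
    local_params = []
    for i, (P, r) in enumerate(mdps):
        th_i, states[i] = local_td0(theta.copy(), P, r, states[i], K)
        local_params.append(th_i)
    theta = np.mean(local_params, axis=0)  # server averages the N local models
```

Under the paper's conditions, averaging across the N agents is what yields the O(1/NKT) linear-speedup rate reported in the Research Type row; this sketch only mirrors that communication pattern, not the analysis.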