Approximated Behavioral Metric-based State Projection for Federated Reinforcement Learning
Authors: Zengxia Guo, Bohui An, Zhongqi Lu
IJCAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we evaluate the effectiveness and generalization of FedRAG using the DeepMind Control Suite (DMC). The DMC is a benchmark for control tasks in continuous action spaces with visual input [Tassa et al., 2018]. We simulated different environments by modifying key physical parameters for several tasks: pole length (cartpole-swing), torso length (cheetah-run), finger distal length (finger-spin), and torso length (walker-walk). As described in the previous section, each client projects its state observation into the embedding space using the approximated behavioral metric-based local state projection network, and updates a local SAC network for policy evaluation and improvement. |
| Researcher Affiliation | Academia | Zengxia Guo^{1,2}, Bohui An^{1,2}, Zhongqi Lu^{1,2}. 1: College of Artificial Intelligence, China University of Petroleum-Beijing, China; 2: Hainan Institute of China University of Petroleum (Beijing), Sanya, Hainan, China. EMAIL, EMAIL, EMAIL |
| Pseudocode | Yes | Algorithm 1 (FedRAG algorithm), line 1: Initialize local networks φ_{ω_k}, φ̄_{ω̄_k}, Q_{θ_k}, Q̄_{θ̄_k}, π_{ψ_k}, R̂_{ξ_k}, P̂_{η_k} for each client k ∈ {1, 2, …, N}, and the global network φ_{ω_G} at the server. |
| Open Source Code | No | The paper does not provide any explicit statements about making code available or links to a code repository. |
| Open Datasets | Yes | In this section, we evaluate the effectiveness and generalization of FedRAG using the DeepMind Control Suite (DMC). The DMC is a benchmark for control tasks in continuous action spaces with visual input [Tassa et al., 2018]. |
| Dataset Splits | No | The paper describes the environment interaction settings (e.g., episode length, total steps) for Deep Mind Control Suite, which is an RL environment where data is dynamically generated. It does not provide specific train/test/validation splits for a static dataset. |
| Hardware Specification | No | The paper does not contain specific hardware details such as GPU models, CPU types, or memory specifications used for running experiments. |
| Software Dependencies | No | The paper mentions the use of 'neural network approximator' and 'policy networks' but does not specify any software libraries or frameworks with version numbers (e.g., PyTorch, TensorFlow, Python version). |
| Experiment Setup | Yes | We render 84 × 84 pixel frames and stack 3 frames as the observation at each time step. We set an episode to consist of 125 environment steps, training over a total of 4000 episodes, which equates to 500,000 steps. For each setting, we evaluate the performance of each client in both the same and other environments every 16 local update episodes. In the federated learning scenario, every 4 episodes, clients upload their local parameters, which the server then aggregates and redistributes as global parameters. |
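The communication pattern described in the Experiment Setup row (clients train locally, then every 4 episodes upload parameters for server-side aggregation and redistribution) can be sketched as a minimal loop. This is an illustrative assumption, not the paper's implementation: the aggregation rule here is a plain FedAvg-style mean, `local_update` is a stand-in for an episode of local SAC plus projection-network training, and all sizes (`NUM_CLIENTS`, episode counts, parameter shape) are toy values.

```python
# Minimal sketch of the FedRAG-style federated cycle described in the report,
# assuming FedAvg-style parameter averaging (the paper's exact rule may differ).
import numpy as np

NUM_CLIENTS = 4       # assumption: a few clients in perturbed DMC environments
AGG_INTERVAL = 4      # episodes between uploads (stated in the paper)
TOTAL_EPISODES = 16   # shortened from the paper's 4000 for illustration

def local_update(params: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Stand-in for one local episode of SAC + state-projection training."""
    return params + 0.01 * rng.standard_normal(params.shape)

def aggregate(client_params: list) -> np.ndarray:
    """Server averages the uploaded local parameters into global parameters."""
    return np.mean(client_params, axis=0)

rng = np.random.default_rng(0)
global_params = np.zeros(8)  # toy stand-in for projection-network weights
clients = [global_params.copy() for _ in range(NUM_CLIENTS)]

for episode in range(1, TOTAL_EPISODES + 1):
    clients = [local_update(p, rng) for p in clients]
    if episode % AGG_INTERVAL == 0:
        # upload -> aggregate -> redistribute as the new global parameters
        global_params = aggregate(clients)
        clients = [global_params.copy() for _ in range(NUM_CLIENTS)]
```

After an aggregation round, every client starts from the same redistributed global parameters, which is the synchronization point the report's schedule describes.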