Hardware Conditioned Policies for Multi-Robot Transfer Learning
Authors: Tao Chen, Adithyavairavan Murali, Abhinav Gupta
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our aim is to demonstrate the importance of conditioning the policy on a hardware representation v_h for transferring complicated policies between dissimilar robotic agents. We show performance gains in two diverse settings: manipulation and hopper locomotion. (A minimal sketch of such conditioning appears after the table.) |
| Researcher Affiliation | Academia | Tao Chen, The Robotics Institute, Carnegie Mellon University, Pittsburgh, PA 15213, EMAIL; Adithyavairavan Murali, The Robotics Institute, Carnegie Mellon University, Pittsburgh, PA 15213, EMAIL; Abhinav Gupta, The Robotics Institute, Carnegie Mellon University, Pittsburgh, PA 15213, EMAIL |
| Pseudocode | Yes | Algorithm 1 Hardware Conditioned Policies (HCP) |
| Open Source Code | No | The paper provides a link to videos of the experiments but does not provide access to the source code for the methodology described. |
| Open Datasets | No | The paper describes creating custom robot manipulators and varying their properties within the MuJoCo simulation environment, but it does not provide access information (link, DOI, or citation) for a publicly available dataset. |
| Dataset Splits | Yes | We performed several leave-one-out experiments (train on 8 robot types, leave 1 robot type untouched) on these robot types. (See the split sketch after the table.) |
| Hardware Specification | No | The paper mentions running experiments on a 'real Sawyer robot' but does not specify the computing hardware (e.g., CPU, GPU models, memory) used for training models or running simulations. |
| Software Dependencies | No | The paper mentions using MuJoCo as a physics engine and specific DRL algorithms (PPO, DDPG+HER) but does not provide specific version numbers for software libraries, programming languages, or other ancillary dependencies. |
| Experiment Setup | Yes | Rewards: We use a binary sparse reward setting because sparse rewards are more realistic in robotics applications, and we use DDPG+HER as the backbone training algorithm. The agent only gets a +1 reward if the POI is within ε Euclidean distance of the desired goal position; otherwise, it gets a -1 reward. We use ε = 0.02 m in all experiments. (A sketch of this reward appears after the table.) |
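
As a minimal sketch of the hardware-conditioning idea referenced in the Research Type and Pseudocode rows, assuming a PyTorch actor network; the class name, dimensions, and hidden sizes are illustrative assumptions, not the paper's reported architecture:

```python
import torch
import torch.nn as nn

class HardwareConditionedPolicy(nn.Module):
    """Illustrative actor conditioned on a hardware representation v_h.

    The state and the hardware vector are concatenated before the first
    layer, so a single network can act for dissimilar robots. All sizes
    here are assumptions, not the paper's actual architecture.
    """

    def __init__(self, state_dim, hardware_dim, action_dim, hidden_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + hardware_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, action_dim),
            nn.Tanh(),  # bound actions to [-1, 1]
        )

    def forward(self, state, v_h):
        # pi(a | s, v_h): the policy input is the concatenation [s; v_h]
        return self.net(torch.cat([state, v_h], dim=-1))
```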
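
The leave-one-out protocol from the Dataset Splits row (train on 8 robot types, hold 1 out, implying 9 types total) can be sketched as below; the robot-type labels and function name are hypothetical placeholders:

```python
def leave_one_out_splits(robot_types):
    """Yield (train_types, held_out_type) pairs: each fold trains on
    all robot types except one and evaluates transfer on the held-out
    type. With 9 types this gives 8-train / 1-test folds."""
    for held_out in robot_types:
        train = [r for r in robot_types if r != held_out]
        yield train, held_out

# Hypothetical labels for the 9 robot types
robot_types = [f"robot_type_{i}" for i in range(9)]
for train_types, test_type in leave_one_out_splits(robot_types):
    pass  # train HCP on train_types, evaluate transfer on test_type
```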
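
The binary sparse reward described in the Experiment Setup row, as a short sketch; the function name and NumPy representation are assumptions:

```python
import numpy as np

EPSILON = 0.02  # meters, the threshold reported in the paper

def sparse_reward(poi_pos, goal_pos, epsilon=EPSILON):
    """+1 if the point of interest (POI) is within epsilon Euclidean
    distance of the desired goal position, -1 otherwise."""
    return 1.0 if np.linalg.norm(poi_pos - goal_pos) < epsilon else -1.0
```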