Flow-based Domain Randomization for Learning and Sequencing Robotic Skills
Authors: Aidan Curtis, Eric Li, Michael Noseworthy, Nishad Gothoskar, Sachin Chitta, Hui Li, Leslie Pack Kaelbling, Nicole E Carey
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show that this architecture is more flexible and provides greater robustness than existing approaches that learn simpler, parameterized sampling distributions, as demonstrated in six simulated and one real-world robotics domain. |
| Researcher Affiliation | Collaboration | ¹MIT CSAIL, ²Autodesk Research. Correspondence to: Aidan Curtis <EMAIL>. |
| Pseudocode | Yes | Algorithm 1 (GoFlow); Algorithm 2 (Belief-Space Planner Using BFS) |
| Open Source Code | Yes | A.1. Code Release The codebase for the project can be found here. |
| Open Datasets | No | The paper uses simulated environments like Cartpole, Ant, Quadcopter, Quadruped, and Humanoid in the Isaac Lab suite of environments (Mittal et al., 2023) and a custom gear insertion task. While Isaac Lab is a known simulation environment, the paper does not explicitly state the use or provision of a publicly available dataset in the traditional sense, but rather generates data dynamically through domain randomization within these environments. No specific dataset is cited with access information. |
| Dataset Splits | No | The paper describes how evaluation is performed through policy rollouts on 4096 uniformly selected environment initialization samples, and training uses environments sampled from a distribution. However, it does not specify explicit training/test/validation dataset splits, as the nature of domain randomization involves dynamically generated environments rather than fixed datasets. |
| Hardware Specification | No | The paper mentions the use of a "Franka Emika robot" for real-world experiments. However, it does not provide specific details about the computing hardware (e.g., GPU models, CPU types, memory) used for training the reinforcement learning policies in simulation. |
| Software Dependencies | No | The paper mentions using the Proximal Policy Optimization (PPO) algorithm, the Zuko normalizing flow library (Rozet et al., 2022), the Isaac Lab suite of environments (Mittal et al., 2023), the IndustReal library built on the frankapy toolkit (Tang et al., 2023a; Zhang et al., 2020), and the Bayes3D framework (Gothoskar et al., 2023). However, no specific version numbers are provided for any of these software components. |
| Experiment Setup | Yes | A.5. Hyperparameters: We search over the following values of the α hyperparameter: [0.1, 0.5, 1.0, 1.5, 2.0], and over the following values of the β hyperparameter: [0.0, 0.1, 0.5, 1.0, 2.0]. Other hyperparameters include the number of network updates per training epoch (K = 100), the network learning rate (η_ϕ = 1e-3), and neural spline flow architecture hyperparameters such as network depth (ℓ = 3), hidden features (64), and number of bins (8). |
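The hyperparameter search quoted above is a small grid over α and β with the remaining settings held fixed. A minimal sketch of such a grid search, using only the values reported in Appendix A.5 (the `evaluate` scoring function is hypothetical, standing in for training a policy and measuring its success rate):

```python
from itertools import product

# Search values reported in the paper's Appendix A.5.
ALPHAS = [0.1, 0.5, 1.0, 1.5, 2.0]
BETAS = [0.0, 0.1, 0.5, 1.0, 2.0]

# Fixed hyperparameters from the same appendix.
FIXED = {
    "updates_per_epoch": 100,   # K
    "learning_rate": 1e-3,      # eta_phi
    "flow_depth": 3,            # neural spline flow depth (l)
    "hidden_features": 64,
    "num_bins": 8,
}

def grid_search(evaluate):
    """Return the (alpha, beta) pair that maximizes `evaluate`.

    `evaluate(alpha, beta)` is a placeholder for the real procedure
    (train with these hyperparameters, then score policy rollouts);
    the paper does not specify the selection criterion in this quote.
    """
    return max(product(ALPHAS, BETAS), key=lambda ab: evaluate(*ab))

# Purely illustrative toy scoring function:
best_alpha, best_beta = grid_search(
    lambda a, b: -(a - 1.0) ** 2 - (b - 0.5) ** 2
)
print(best_alpha, best_beta)  # → 1.0 0.5
```

The toy score peaks at α = 1.0, β = 0.5, both of which lie on the grid, so the search recovers them exactly.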