Latent Diffusion Planning for Imitation Learning

Authors: Amber Xie, Oleh Rybkin, Dorsa Sadigh, Chelsea Finn

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We focus our experiments on 4 image-based imitation learning tasks: (1) Robomimic Lift, (2) Robomimic Can, (3) Robomimic Square, and (4) ALOHA Sim Transfer Cube. Robomimic (Mandlekar et al., 2021) is a robotic manipulation and imitation benchmark that includes the tasks Lift, Can, and Square. We evaluate the success rate over 50 trials, using the best checkpoint among the last 5 saved checkpoints, with 2 seeds. We create a real-world implementation of the Robomimic Lift task, where the task is to pick up a red block from a randomly initialized position. To evaluate our policies, we calculate the success rate across 45 evaluation trials: to cover the initial state space thoroughly, we evaluate over a 3x3 grid of points with 5 attempts per point, and we evaluate 3 seeds per method. In Table 1, we examine how action-free data can be used to improve imitation learning policies. In Table 2, we present imitation learning results with suboptimal data. In Table 3, we provide results on a Franka Lift Cube task. In Table 4, we find that LDP outperforms LDP Hierarchical across the 3 Robomimic tasks.
Researcher Affiliation | Academia | Amber Xie (1), Oleh Rybkin (2), Dorsa Sadigh (1), Chelsea Finn (1); (1) Stanford, (2) UC Berkeley. Correspondence to: Amber Xie <EMAIL>.
Pseudocode | Yes | Algorithm 1: Inference with Latent Diffusion Planning
1: Input: Encoder E, Planner ε_ψ, IDM ε_ξ, Planner Diffusion Timesteps T_p, IDM Diffusion Timesteps T_IDM, Planning Horizon H_p, Action Horizon H_a
2: Observe initial state s_0 and image x_0; k = 0
3: while not done do
4:   z_k ← (E(x_k), s_k)
Open Source Code | Yes | Project Website and Code: https://amberxie88.github.io/ldp/
Open Datasets | Yes | Robomimic (Mandlekar et al., 2021) is a robotic manipulation and imitation benchmark that includes the tasks Lift, Can, and Square. The Transfer Cube task is a simulated bimanual ALOHA task in which one ViperX 6-DoF arm grabs a block and transfers it to the other arm (Zhao et al., 2023). We use the DROID setup (Khazatsky et al., 2024) and teleoperate via the Oculus Quest 2 headset.
Dataset Splits | Yes | For Can and Square, we use 100 of the 200 demonstrations in the Robomimic datasets; for Lift, we use 3 of the 200 total; and for Transfer Cube, we use 25 demonstrations. Our suboptimal data consists of 500 failed trajectories from an undertrained behavior cloning agent. Our action-free data consists of 100 demonstrations for Lift, Can, and Square from the Robomimic dataset, and 25 demonstrations for Cube.
Hardware Specification | No | The paper does not explicitly describe the hardware (e.g., GPU/CPU models, memory) used to train the models or run the experiments. It mentions a "Franka Panda 7-degree-of-freedom robot arm with a wrist-mounted ZED camera" for the real-world task, but this refers to the robotic system itself rather than the computational hardware used for training.
Software Dependencies | No | The paper mentions software components and frameworks such as a "Jax reimplementation of the convolutional Diffusion Policy" and a "Conditional U-Net Architecture", but it does not specify version numbers or a complete list of dependencies.
Experiment Setup | Yes |
Table 8. Diffusion Policy architecture hyperparameters: down dims [256, 512, 1024]; n diffusion steps 100; batch size 256; lr 1e-4; n grad steps 500k.
Table 9. IDM architecture hyperparameters: n blocks 3; n diffusion steps 100; batch size 256; lr 1e-4; n grad steps 500k.
Table 10. VAE architecture hyperparameters: block out channels [128, 256, 256, 256, 256, 256]; down block types [DownEncoderBlock2D] x6; up block types [UpDecoderBlock2D] x6; latent channels 4; latent dim (2, 2, 4); KL beta 1e-5 (Lift), 1e-6 (Can), 1e-6 (Square), 1e-7 (ALOHA Cube); n grad steps 300k.
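The inference procedure quoted under Pseudocode (Algorithm 1: encode the observation, diffuse a plan of future latents, decode actions with the IDM, replan) can be sketched in Python. This is a minimal control-flow sketch, not the authors' implementation: the planner and IDM are stubbed with simple functions, and all shapes, dimensions, and function names are assumptions.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration.
LATENT_DIM = 16   # assumed latent size
ACTION_DIM = 7    # assumed action size
H_P = 16          # planning horizon H_p
H_A = 8           # action horizon H_a (actions executed per replan)

def encode(image, state):
    """Stand-in for the encoder E: builds z_k = (E(x_k), s_k)."""
    return np.concatenate([image.ravel()[:LATENT_DIM - len(state)], state])

def plan_latents(z_k, horizon=H_P):
    """Stand-in for the latent diffusion planner eps_psi:
    in LDP this would denoise a sequence of future latents."""
    return np.tile(z_k, (horizon, 1))

def inverse_dynamics(z_t, z_next):
    """Stand-in for the diffusion IDM eps_xi: infers the action
    taking the system from z_t to z_next."""
    return np.zeros(ACTION_DIM)

def ldp_inference(env_step, image, state, max_steps=32):
    """Replanning loop of Algorithm 1: plan H_P latents,
    execute H_A actions via the IDM, then replan."""
    actions_taken = []
    k = 0
    while k < max_steps:            # "while not done do"
        z_k = encode(image, state)  # line 4: z_k <- (E(x_k), s_k)
        plan = plan_latents(z_k)    # denoise a latent plan
        for t in range(min(H_A, len(plan) - 1)):
            a = inverse_dynamics(plan[t], plan[t + 1])
            image, state = env_step(a)
            actions_taken.append(a)
            k += 1
    return actions_taken
```

With the default settings this executes 8 actions between replans, mirroring the receding-horizon style used by diffusion-policy methods.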
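The real-robot evaluation protocol described under Research Type (a 3x3 grid of initial positions, 5 attempts per point, 45 trials per seed) can be sketched as follows; `run_trial` is a hypothetical stand-in for executing the policy once from a given initial position.

```python
import itertools

def evaluate_policy(run_trial, grid_size=3, attempts_per_point=5):
    """Success rate over a grid of initial positions.

    With the defaults this performs 3 * 3 * 5 = 45 trials,
    matching the protocol described in the paper.
    """
    results = []
    for row, col in itertools.product(range(grid_size), repeat=2):
        for _ in range(attempts_per_point):
            results.append(bool(run_trial(row, col)))
    return sum(results) / len(results)
```

Averaging this rate over 3 seeds per method gives the numbers reported for the real-world Lift task.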
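The hyperparameters in Tables 8-10 can be collected into a single configuration for reference. The values come from the tables above; the dictionary structure and key names are assumptions for illustration, not the authors' config format.

```python
# Hyperparameters from Tables 8-10 (key names are illustrative).
ldp_config = {
    "diffusion_policy": {            # Table 8
        "down_dims": [256, 512, 1024],
        "n_diffusion_steps": 100,
        "batch_size": 256,
        "lr": 1e-4,
        "n_grad_steps": 500_000,
    },
    "idm": {                         # Table 9
        "n_blocks": 3,
        "n_diffusion_steps": 100,
        "batch_size": 256,
        "lr": 1e-4,
        "n_grad_steps": 500_000,
    },
    "vae": {                         # Table 10
        "block_out_channels": [128, 256, 256, 256, 256, 256],
        "down_block_types": ["DownEncoderBlock2D"] * 6,
        "up_block_types": ["UpDecoderBlock2D"] * 6,
        "latent_channels": 4,
        "latent_dim": (2, 2, 4),
        "kl_beta": {"lift": 1e-5, "can": 1e-6, "square": 1e-6, "aloha_cube": 1e-7},
        "n_grad_steps": 300_000,
    },
}
```

Note that only the KL weight is tuned per task; the remaining hyperparameters are shared across all four tasks.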