Responsive Noise-Relaying Diffusion Policy: Responsive and Efficient Visuomotor Control
Authors: Zhuoqun Chen, Xiu Yuan, Tongzhou Mu, Hao Su
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our approach across a range of benchmarks, focusing on 9 tasks from three well-established datasets: ManiSkill2 (Gu et al., 2023), ManiSkill3 (Tao et al., 2024), and Adroit (Rajeswaran et al., 2017). Our primary evaluation targets 5 tasks involving dynamic object manipulation that demand responsive control. Empirical results show that RNR-DP significantly outperforms Diffusion Policy, delivering much more responsive control. Additionally, we extend our evaluation to tasks that do not require responsive control. The results indicate that, even on these simpler tasks, RNR-DP functions as a superior acceleration method compared to popular alternatives such as DDIM (Song et al., 2020) and Consistency Policy (Song et al., 2023; Prasad et al., 2024). Overall, our evaluations systematically demonstrate that RNR-DP provides both highly responsive and efficient control. |
| Researcher Affiliation | Academia | Zhuoqun Chen EMAIL, UC San Diego; Xiu Yuan EMAIL, UC San Diego; Tongzhou Mu EMAIL, UC San Diego; Hao Su EMAIL, UC San Diego |
| Pseudocode | Yes | Pseudocode is provided in Algorithm 1 (Section 5.1); Algorithm 1, Noise-Relaying Diffusion Policy Inference (Appendix B.1); Algorithm 2, Responsive Noise-Relaying Diffusion Policy Training (Appendix B.2). |
| Open Source Code | No | Our project page is available at https://rnr-dp.github.io/. |
| Open Datasets | Yes | We evaluate our approach across a range of benchmarks, focusing on 9 tasks from three well-established datasets: ManiSkill2 (Gu et al., 2023), ManiSkill3 (Tao et al., 2024), and Adroit (Rajeswaran et al., 2017). ... ManiSkill2 and ManiSkill3 demonstrations are provided in Gu et al. (2023) and Tao et al. (2024), and Adroit demonstrations are provided in Rajeswaran et al. (2017). |
| Dataset Splits | No | The paper mentions 'Traj Num for Training' in Table 9, indicating the number of trajectories used for training, but does not provide explicit training/test/validation splits (e.g., percentages or specific counts for each split) within the text. |
| Hardware Specification | Yes | We train our models and baselines with cluster-assigned GPUs (NVIDIA 2080Ti & A10). |
| Software Dependencies | No | The paper mentions software components such as a 'UNet-based architecture', the 'AdamW optimizer', and 'Cosine EMA', but does not provide specific version numbers for any libraries or frameworks. |
| Experiment Setup | Yes | We summarize the key hyperparameters of RNR-DP in Table 10. The observation horizon To and noise-relaying buffer capacity f for each task is listed in Table 11. ... We use the AdamW optimizer with an initial learning rate of 1e-4, applying 500 warmup steps followed by cosine decay. We use a batch size of 1024 for state policies and 256 for visual policies for both the ManiSkill and Adroit benchmarks. We evaluate DP, CP and RNR-DP model checkpoints using EMA weights every 10K training iterations for ManiSkill tasks and every 5K for Adroit tasks. |
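The training setup quoted above (linear warmup followed by cosine decay, plus EMA checkpoints for evaluation) can be sketched as follows. This is a minimal illustration, not the authors' code: `total_steps` and the EMA `decay` rate are assumed values not stated in the quoted excerpt, and the paper's "Cosine EMA" likely refers to a decay *schedule* rather than the constant decay used here.

```python
import math

def lr_schedule(step, base_lr=1e-4, warmup_steps=500, total_steps=100_000):
    """Warmup-then-cosine-decay schedule matching the quoted setup.

    Ramps linearly from 0 to base_lr over `warmup_steps`, then decays
    to 0 at `total_steps` along a half cosine. `total_steps` is an
    assumed training length for illustration only.
    """
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

class EMA:
    """Exponential moving average of parameters (plain floats here).

    The paper evaluates checkpoints "using EMA weights"; decay=0.999
    is a conventional placeholder, not a value from the paper.
    """
    def __init__(self, params, decay=0.999):
        self.decay = decay
        self.shadow = list(params)

    def update(self, params):
        # shadow <- decay * shadow + (1 - decay) * current params
        self.shadow = [self.decay * s + (1.0 - self.decay) * p
                       for s, p in zip(self.shadow, params)]
```

At step 0 the learning rate is 0, it peaks at 1e-4 when warmup ends at step 500, and it returns to 0 at the assumed final step.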