SafeDiffuser: Safe Planning with Diffusion Probabilistic Models

Authors: Wei Xiao, Tsun-Hsuan (Johnson) Wang, Chuang Gan, Ramin Hasani, Mathias Lechner, Daniela Rus

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We set up experiments to answer the following questions: Does our method match the theoretical potential in various tasks quantitatively and qualitatively? How does our method compare with state-of-the-art approaches in enforcing safety specifications? How does our proposed method affect the performance of diffusion under guaranteed specifications? We focus on three experiments from D4RL (Farama-foundation): maze (maze2d-large-v1), gym robots (Walker2d-v2 and Hopper-v2), and manipulation. The training data is publicly available, see Janner et al. (2022). The experiment details and metrics used are shown in the Appendix. The safe diffusers generate both the planning trajectory and control for the robots, and the score/reward is based on closed-loop control.
Researcher Affiliation | Collaboration | 1 Computer Science and Artificial Intelligence Lab, MIT, Cambridge, MA 02139, USA; 2 UMass Amherst and MIT-IBM Watson AI Lab, USA
Pseudocode | Yes | Algorithm 1: Enforcing invariance in diffusion models within a diffusion step
Open Source Code | Yes | Videos and code are available at: https://safediffuser.github.io/safediffuser/
Open Datasets | Yes | We focus on three experiments from D4RL (Farama-foundation): maze (maze2d-large-v1), gym robots (Walker2d-v2 and Hopper-v2), and manipulation. The training data is publicly available, see Janner et al. (2022).
Dataset Splits | Yes | The training data is publicly available from Janner et al. (2022). The diffusion model structure is the same as the open-source one (Maze2D-large-v1) provided in Janner et al. (2022). We set the planning horizon as 384 and the diffusion steps as 256 for the proposed methods. The learning rate is 2e-4 with 2e6 training steps.
Hardware Specification | Yes | The training of the model takes about 10 hours on an Nvidia RTX-3090 GPU.
Software Dependencies | No | The paper does not explicitly mention specific versions of software dependencies such as Python, PyTorch, or CUDA. It refers to environments such as MuJoCo and PyBullet, but without version numbers.
Experiment Setup | Yes | We set the planning horizon as 384 and the diffusion steps as 256 for the proposed methods. The learning rate is 2e-4 with 2e6 training steps.
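The Pseudocode row above refers to the paper's Algorithm 1, which enforces a safety invariance inside each diffusion denoising step via control-barrier-function (CBF) conditions. The snippet below is a minimal illustrative sketch, not the paper's actual algorithm: it assumes a single circular no-go region, a batch of 2-D waypoints, and a one-constraint quadratic program whose closed-form solution filters the nominal denoising update. All function and variable names here are hypothetical.

```python
import numpy as np

def safe_denoise_step(x, u_nom, center, radius, alpha=1.0):
    """Project a nominal denoising update onto a CBF constraint (sketch).

    x      : (N, 2) current waypoints of the noisy trajectory
    u_nom  : (N, 2) nominal per-waypoint update proposed by the diffusion model
    Safe set: b(x) = ||x - center||^2 - radius^2 >= 0 (outside a disk).
    Per-waypoint QP  min ||u - u_nom||^2  s.t.  grad_b(x) . u >= -alpha * b(x)
    has the closed-form solution computed below.
    """
    d = x - center                           # (N, 2) offsets from obstacle center
    b = np.sum(d**2, axis=1) - radius**2     # barrier value per waypoint
    grad = 2.0 * d                           # gradient of b w.r.t. x
    g_dot_u = np.sum(grad * u_nom, axis=1)   # directional derivative under u_nom
    denom = np.sum(grad**2, axis=1) + 1e-12
    # Lagrange multiplier is active only where the CBF condition is violated
    lam = np.maximum(0.0, (-alpha * b - g_dot_u) / denom)
    return u_nom + lam[:, None] * grad

# Toy usage: a nominal update that heads straight into an obstacle at the origin
x = np.array([[1.1, 0.0]])
u_nom = np.array([[-0.5, 0.0]])
u_safe = safe_denoise_step(x, u_nom, center=np.zeros(2), radius=1.0)
```

The filtered update `u_safe` satisfies the CBF condition with equality when the nominal update violates it, and leaves `u_nom` untouched otherwise, which mirrors the minimally invasive character of safety filtering within a diffusion step.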