BaB-ND: Long-Horizon Motion Planning with Branch-and-Bound and Neural Dynamics

Authors: Keyi Shen, Jiangwei Yu, Jose Barreiros, Huan Zhang, Yunzhu Li

ICLR 2025

Reproducibility Variable Result LLM Response
Research Type Experimental Our framework achieves superior planning performance, generating high-quality state-action trajectories and surpassing existing methods in challenging, contact-rich manipulation tasks such as non-prehensile planar pushing with obstacles, object sorting, and rope routing in both simulated and real-world settings. 4 EXPERIMENTAL RESULTS In this section, we evaluate the performance of our BaB-ND on a range of complex robotic manipulation tasks. Our primary objective is to investigate three key questions through experiments: (1) How effectively does BaB-ND handle long-horizon planning? (2) Is BaB-ND applicable to diverse manipulation scenarios involving multi-object interactions and deformable objects? (3) How does the scalability of BaB-ND compare to existing methods? Figure 4: Optimization result on a synthetic f(u) over increasing dimensions d. BaB-ND outperforms all baselines in terms of the optimized objective. We run all methods multiple times and visualize the median values with the 25th and 75th percentiles in the shaded area. Figure 6: Quantitative analysis of planning performance and execution performance in the real world. BaB-ND consistently outperforms baselines on open-loop performance, leading to better closed-loop performance. Table 4: Ablation study on branching and bounding components
Researcher Affiliation Collaboration Keyi Shen (1), Jiangwei Yu (1), Jose Barreiros (2), Huan Zhang (1), Yunzhu Li (3); (1) University of Illinois Urbana-Champaign, (2) Toyota Research Institute, (3) Columbia University
Pseudocode Yes Our main algorithm is shown in Algorithm 1. Algorithm 1: Branch and bound for planning (comments are in brown). Algorithm 4: Rapidly-Exploring Random Tree (RRT). Algorithm 5: Probabilistic Roadmap (PRM) Construction. Algorithm 6: Planning over PRM.
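The baseline pseudocode listed above includes RRT. As a rough illustration of what such a sampling-based baseline does, here is a minimal, generic 2-D RRT sketch; the unit-square workspace, step size, goal bias, and the absence of collision checks are all illustrative assumptions and not the paper's Algorithm 4:

```python
import math
import random

def rrt(start, goal, step=0.1, iters=2000, goal_tol=0.15, seed=0):
    """Minimal 2-D RRT in the unit square (textbook version, no obstacles)."""
    rng = random.Random(seed)
    nodes = [start]
    parent = {0: None}
    for _ in range(iters):
        # Goal-biased sampling: 10% of the time, sample the goal itself.
        sample = goal if rng.random() < 0.1 else (rng.random(), rng.random())
        # Find the nearest existing tree node.
        i = min(range(len(nodes)), key=lambda k: math.dist(nodes[k], sample))
        nx, ny = nodes[i]
        d = math.dist((nx, ny), sample)
        if d == 0:
            continue
        # Steer one fixed step from the nearest node toward the sample.
        new = (nx + step * (sample[0] - nx) / d,
               ny + step * (sample[1] - ny) / d)
        nodes.append(new)
        parent[len(nodes) - 1] = i
        if math.dist(new, goal) < goal_tol:
            # Reached the goal region: backtrack parents to recover the path.
            path, j = [], len(nodes) - 1
            while j is not None:
                path.append(nodes[j])
                j = parent[j]
            return path[::-1]
    return None
```

A real baseline would add collision checking against obstacles before accepting `new`; this sketch only shows the sample-nearest-steer loop.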
Open Source Code No Project page: https://robopil.github.io/bab-nd/. Please refer to our project page for video demonstrations.
Open Datasets No For training the dynamics model, we randomly collect interaction data from simulators. For Pushing w/ Obstacles, Object Merging, and Object Sorting tasks, we use Pymunk (Blomqvist, 2022) to collect data, and for the Rope Routing task, we use NVIDIA FleX (Macklin et al., 2014) to generate data.
Dataset Splits No For training the dynamics model, we randomly collect interaction data from simulators. For Pushing w/ Obstacles, Object Merging, and Object Sorting tasks, we use Pymunk (Blomqvist, 2022) to collect data, and for the Rope Routing task, we use NVIDIA FleX (Macklin et al., 2014) to generate data. In the following paragraphs, we introduce the data generation process for different tasks in detail. No specific train/test/validation splits are mentioned for the collected data, only the generation process and total episodes.
Hardware Specification No Our approach employs specialized branching heuristics to divide the search space into subdomains, and applies a modified bound propagation method, inspired by the state-of-the-art neural network verifier α,β-CROWN, to efficiently estimate objective bounds within these subdomains. Our framework, BaB-ND (Figure 1.a), divides the action space into smaller subdomains through novel branching heuristics (branching), estimates objective bounds using a modified bound propagation procedure to prune subdomains that cannot yield better solutions (bounding), and focuses searching on the most promising subdomains (searching). GPU support. Our implementation benefits significantly from GPU acceleration, achieving over 10x speedup compared to the CPU version, even for small planning problems. No specific GPU or CPU models are mentioned, only general statements about GPU acceleration.
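The branching/bounding/searching loop quoted above can be illustrated with a toy best-first branch-and-bound minimizer over a 1-D action interval. This is only a sketch: the `lower_bound` here is a caller-supplied Lipschitz bound, whereas BaB-ND uses CROWN-style bound propagation through the learned neural dynamics; the objective, domain, and constants below are illustrative assumptions:

```python
import heapq

def branch_and_bound(f, lower_bound, domain, iters=60, tol=1e-3):
    """Minimize f over an interval [lo, hi] by branching and bounding.

    `lower_bound(lo, hi)` must under-estimate min f on [lo, hi].
    """
    lo, hi = domain
    best_u = (lo + hi) / 2.0
    best_val = f(best_u)
    heap = [(lower_bound(lo, hi), lo, hi)]           # subdomains keyed by bound
    for _ in range(iters):
        if not heap:
            break
        lb, lo, hi = heapq.heappop(heap)
        if lb >= best_val - tol:                     # bounding: prune hopeless subdomains
            continue
        mid = (lo + hi) / 2.0                        # branching: split the subdomain
        for a, b in ((lo, mid), (mid, hi)):
            u = (a + b) / 2.0                        # searching: sample a candidate
            val = f(u)
            if val < best_val:
                best_val, best_u = val, u
            heapq.heappush(heap, (lower_bound(a, b), a, b))
    return best_u, best_val
```

For example, minimizing `f(u) = (u - 0.3)**2` on `[-1, 1]` with the Lipschitz bound `f(mid) - L*(b-a)/2` (L = 3) converges to the optimum near `u = 0.3`; in BaB-ND the action space is high-dimensional and the bounds come from the verifier-style procedure instead.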
Software Dependencies No For Pushing w/ Obstacles, Object Merging, and Object Sorting tasks, we use Pymunk (Blomqvist, 2022) to collect data, and for the Rope Routing task, we use NVIDIA FleX (Macklin et al., 2014) to generate data. This lists software but does not provide specific version numbers, only citations to papers or years.
Experiment Setup Yes Pushing w/ Obstacles. We use a four-layer MLP with [128, 256, 256, 128] neurons in each respective layer. The model is trained with an Adam optimizer for 7 epochs, using a learning rate of 0.001. A cosine learning rate scheduler is applied to regularize the learning rate. Object Merging. We use the same architecture, optimizer, training epochs, and learning rate scheduler as in the Pushing w/ Obstacles setup. Rope Routing. We use a two-layer MLP with 128 neurons in each layer. The model is trained with an Adam optimizer for 50 epochs, with a learning rate of 0.001, and a cosine learning rate scheduler to adjust the learning rate. Object Sorting. We use the same architecture as DPI-Net (Li et al., 2018). The model is trained with an Adam optimizer for 15 epochs, with a learning rate of 0.001, and a cosine learning rate scheduler to adjust the learning rate. D.6 HYPERPARAMETERS OF BASELINES Table 10: Hyperparameter setting for baselines
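The quoted setup (a four-layer MLP with [128, 256, 256, 128] hidden units, Adam at learning rate 0.001, cosine learning-rate scheduling) can be sketched in NumPy. The input/output dimensions, ReLU activations, and He-style initialization below are assumptions not stated in the quote; this only shows the layer shapes and the cosine schedule, not the paper's training code:

```python
import math
import numpy as np

def make_mlp(in_dim, hidden=(128, 256, 256, 128), out_dim=10, seed=0):
    """Weights/biases for an MLP with the reported [128, 256, 256, 128] widths.
    in_dim/out_dim are placeholders; the paper does not quote them here."""
    rng = np.random.default_rng(seed)
    sizes = [in_dim, *hidden, out_dim]
    return [(rng.standard_normal((a, b)) * math.sqrt(2.0 / a), np.zeros(b))
            for a, b in zip(sizes[:-1], sizes[1:])]

def forward(params, x):
    """Forward pass; ReLU on hidden layers is an assumed activation."""
    for i, (w, b) in enumerate(params):
        x = x @ w + b
        if i < len(params) - 1:
            x = np.maximum(x, 0.0)
    return x

def cosine_lr(epoch, total_epochs, base_lr=1e-3):
    """Cosine annealing from base_lr at epoch 0 down to 0 at total_epochs."""
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * epoch / total_epochs))
```

For the Pushing w/ Obstacles setup, `cosine_lr(e, 7)` decays the reported 0.001 learning rate over the 7 training epochs; the 50- and 15-epoch tasks would use the same schedule with different `total_epochs`.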