Physics-informed Temporal Difference Metric Learning for Robot Motion Planning
Authors: Ruiqi Ni, Zherong Pan, Ahmed Hussain Qureshi
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our framework on complex planning tasks with C-space ranging from 2-12 DOF and demonstrate its scalability to complex scenes and generalization ability to multiple and unseen environments. Our results show that our proposed approach significantly outperformed prior state-of-the-art learning-based planning methods. Additionally, we compare our proposed metric learning approach with other metrics commonly used in Reinforcement Learning (RL) for the value function learning. Our results demonstrate that our metric better captures the key properties of the Eikonal equation, leading to a more accurate approximation of its solution. |
| Researcher Affiliation | Collaboration | Ruiqi Ni¹, Zherong Pan², Ahmed H. Qureshi¹ — ¹Purdue University, ²Lightspeed Studios |
| Pseudocode | No | The paper includes mathematical equations and descriptions of methods but does not contain any explicitly labeled pseudocode or algorithm blocks with structured steps. |
| Open Source Code | Yes | The implementation code repository is available at https://github.com/ruiqini/ntrl-demo. |
| Open Datasets | Yes | We selected ten environments from the Gibson dataset (Li et al., 2021), with room counts ranging from 7 to 16 and dimensions between 90 and 430 square meters. These tasks, adopted from the MπNets dataset (Fishman et al., 2023), require a 7-DOF robot arm to navigate among multiple obstacle blocks on a tabletop. These environments are taken from the C3D dataset (Qureshi et al., 2019; 2020) and consist of 10 cubes of varying sizes randomly placed in a 3D space. |
| Dataset Splits | Yes | In each environment, we evaluated 100 unseen start and goal pairs. For these tasks [C3D], we selected 100 seen and 100 unseen environments. The models were trained on the seen environments. For testing, we chose 500 random start and goal pairs across both seen and unseen environments. We choose 150 seen and 150 unseen environments [7-DOF Manipulator] and train neural models on the seen environment. For testing, we select 300 start and goal pairs in both seen and unseen environments. |
| Hardware Specification | Yes | Furthermore, all experiments and evaluations were conducted on a system with a 3.50 GHz 8-core Intel Core i9 processor, 32 GB RAM, and a GeForce RTX 3090 GPU. |
| Software Dependencies | No | The paper does not provide specific version numbers for software dependencies such as libraries or frameworks used in the implementation. |
| Experiment Setup | Yes | In the 3D environment, we choose λ_E = 10⁻², λ_TD = 10⁻³, λ_N = 10⁻³, λ_C = 0.5 as the hyperparameters, and we choose TD step t = 0.02. However, for manipulator environments, the free space is much smaller than in 3D space, and a large TD step and normal direction can lead to the wrong place, so we reduce to t = 0.005 and λ_N = 2×10⁻⁴. |
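
The reported setup combines four weighted loss terms (Eikonal, temporal-difference, normal-direction, and a fourth term weighted by λ_C). A minimal sketch of how such a weighted objective could be assembled, assuming the negative-exponent reading of the reported weights (the helper name `total_loss` and the per-term inputs are hypothetical, not the authors' implementation):

```python
# Hypothetical weighted-loss combination using the 3D-environment
# hyperparameters quoted above; individual loss terms are placeholders.
LAMBDA_E = 1e-2   # Eikonal-equation residual weight (lambda_E)
LAMBDA_TD = 1e-3  # temporal-difference weight (lambda_TD)
LAMBDA_N = 1e-3   # normal-direction weight (lambda_N); 2e-4 for manipulators
LAMBDA_C = 0.5    # remaining term's weight (lambda_C)
TD_STEP = 0.02    # TD step t = 0.02 in 3D; reduced to 0.005 for manipulators


def total_loss(loss_eikonal: float, loss_td: float,
               loss_normal: float, loss_c: float) -> float:
    """Weighted sum of the four loss terms listed in the setup row."""
    return (LAMBDA_E * loss_eikonal
            + LAMBDA_TD * loss_td
            + LAMBDA_N * loss_normal
            + LAMBDA_C * loss_c)
```

With unit-valued loss terms this evaluates to 0.01 + 0.001 + 0.001 + 0.5 = 0.512, making the relative weighting of the terms easy to sanity-check against the quoted values.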