In-Context Policy Adaptation via Cross-Domain Skill Diffusion

Authors: Minjong Yoo, Woo Kyung Kim, Honguk Woo

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Through experiments with robotic manipulation in Metaworld and autonomous driving in CARLA, we show that our ICPAD framework achieves superior policy adaptation performance under limited target domain data conditions for various cross-domain configurations including differences in environment dynamics, agent embodiment, and task horizon.
Researcher Affiliation | Academia | Minjong Yoo, Woo Kyung Kim, Honguk Woo* Department of Computer Science and Engineering, Sungkyunkwan University EMAIL
Pseudocode | Yes | Algorithm 1: Learning procedure for ICPAD framework
Open Source Code | No | The paper does not provide any specific links to code repositories, explicit statements about code release, or mention of code in supplementary materials.
Open Datasets | Yes | Through a series of experiments encompassing robotic manipulation in Metaworld (Yu et al. 2020) and autonomous driving in CARLA (Dosovitskiy et al. 2017), we demonstrate that ICPAD is applicable for a variety of cross-domain configurations including varied environment dynamics and different embodiment conditions.
Dataset Splits | No | For in-context policy adaptation, we use 5-shot demonstrations for each task. The data requirements for offline skill learning are consistent with those of existing skill-based RL (Pertsch, Lee, and Lim 2021; Shi, Lim, and Lee 2022). For the target domain, the data requirements align with the few-shot adaptation research (Xu et al. 2022; Hakhamaneshi et al. 2022; He et al. 2024).
Hardware Specification | No | The paper does not provide any specific details about the hardware used for experiments, such as GPU models, CPU types, or memory specifications.
Software Dependencies | No | The paper does not specify any software dependencies with version numbers, such as programming languages, libraries, or frameworks.
Experiment Setup | No | We use the Metaworld and multi-stage Metaworld with additional action noise and wind settings. For in-context policy adaptation, we use 5-shot demonstrations for each task. The domain disparity indicates the difference between the source, which includes offline data and online source domains, and target domains (i.e., differences in action noise or wind). Details of the domain disparity settings are in Appendix.
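
The action-noise and wind disparity mentioned in the Experiment Setup entry can be pictured with a minimal sketch. This is not the authors' implementation, and the paper's actual settings are in its Appendix. The names `DomainDisparity` and `perturb_action` are hypothetical, and the sketch assumes that action noise is zero-mean Gaussian noise added to each action dimension and that wind is a constant per-dimension offset.

```python
import random
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class DomainDisparity:
    """Hypothetical container for the two disparity knobs (assumed, not from the paper)."""
    action_noise_std: float = 0.0  # std of zero-mean Gaussian noise per action dimension
    wind: Sequence[float] = ()     # constant per-dimension offset; empty means no wind


def perturb_action(action: Sequence[float],
                   disparity: DomainDisparity,
                   rng: random.Random) -> List[float]:
    """Return the action the environment would actually execute under the disparity."""
    # Treat a missing wind vector as zero wind in every dimension.
    wind = list(disparity.wind) or [0.0] * len(action)
    if len(wind) != len(action):
        raise ValueError("wind must have one entry per action dimension")
    # Add the constant wind offset and a fresh Gaussian noise sample to each dimension.
    return [a + w + rng.gauss(0.0, disparity.action_noise_std)
            for a, w in zip(action, wind)]


# Source domain: no disparity. Target domain: wind plus noise (illustrative values only).
source = DomainDisparity()
target = DomainDisparity(action_noise_std=0.05, wind=[0.1, 0.0])
rng = random.Random(0)
executed_in_source = perturb_action([0.5, -0.2], source, rng)
executed_in_target = perturb_action([0.5, -0.2], target, rng)
```

Under these assumptions, the same policy output is executed unchanged in the source domain and shifted in the target domain, which is the kind of gap a few demonstrations would have to reveal.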