Learning Efficient Robotic Garment Manipulation with Standardization

Authors: Changshi Zhou, Feng Luan, Jiarui Hu, Shaoqiang Meng, Zhipeng Wang, Yanchao Dong, Yanmin Zhou, Bin He

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In simulation, APS-Net outperforms state-of-the-art methods for long sleeves, achieving 3.9% better coverage, 5.2% higher IoU, and a 0.14 decrease in KD (7.09% relative reduction). Real-world folding tasks further demonstrate that standardization simplifies the folding process. We conduct extensive real-world experiments on garments from three categories (long sleeves, jumpsuits, and skirts), demonstrating the effectiveness of APS-Net in both garment standardization and subsequent folding tasks using a dual-UR5 robot.
Researcher Affiliation | Academia | 1Shanghai Research Institute for Intelligent Autonomous Systems, Tongji University, Shanghai, China 2The National Key Laboratory of Autonomous Intelligent Unmanned Systems, Tongji University, Shanghai 201210, China 3The Frontiers Science Center for Intelligent Autonomous Systems, Shanghai 201210, China 4College of Electronics and Information Engineering, Tongji University, Shanghai 201804, China. Correspondence to: Yanmin Zhou <EMAIL>.
Pseudocode | No | The paper describes methods and mathematical formulations in regular paragraph text and equations but does not contain a clearly labeled pseudocode or algorithm block.
Open Source Code | No | Project page: https://hellohaia.github.io/APS/. (Upon visiting the project page, it states: "Code will be released soon.")
Open Datasets | Yes | The cloth meshes were sampled from a subset of shirts in the test split of the CLOTH3D dataset (Bertiche et al., 2020).
Dataset Splits | Yes | Each garment category includes 2000 training tasks and 50 testing tasks using unseen garment meshes. This process resulted in a dataset of 10000 images for training and 500 for testing.
Hardware Specification | Yes | All experiments are conducted on an NVIDIA RTX 4090 GPU with an Intel i9-13900K CPU (5.80 GHz), supported by 64 GB RAM on Ubuntu 18.04 LTS.
Software Dependencies | No | The paper mentions several software components, such as the OpenAI Gym API, PyFlex, NVIDIA Flex, SoftGym, Blender 3D, DeepLabv3, and the Grounded-SAM model, but does not provide specific version numbers for these key software components or for the programming languages/frameworks used in the implementation.
Experiment Setup | Yes | For the unfolding network, the initial observation ot of size (H, W) = (128, 128) undergoes 16 rotations (covering 360°) and 5 scales {0.75, 1.0, 1.5, 2.0, 2.5}, yielding m = 80 stacked transformed observations. Exploration follows a decaying ϵ-greedy strategy (initial ϵ = 1) with a half-life of 5000 steps for selecting action primitives (fling vs. p&p). Optimization is conducted using the Adam optimizer with a learning rate of 1.0 × 10⁻³ over 100,000 steps. For the key-point detection model, training is conducted for 200 epochs with a batch size of 32, a learning rate of 1.0 × 10⁻⁴, and rotation and scaling augmentations.
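The transform stack and exploration schedule described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the helper names (`build_transform_stack`, `fit_to`, `epsilon`) and the use of `scipy.ndimage` for rotation/scaling are assumptions; the paper only specifies the counts (16 rotations × 5 scales = 80 views of a 128×128 observation) and the ϵ half-life of 5000 steps.

```python
import numpy as np
from scipy.ndimage import rotate, zoom


def fit_to(img, h, w):
    """Center-crop or zero-pad a 2D image back to (h, w)."""
    out = np.zeros((h, w), dtype=img.dtype)
    ih, iw = img.shape
    sy, sx = max((ih - h) // 2, 0), max((iw - w) // 2, 0)  # source crop offset
    dy, dx = max((h - ih) // 2, 0), max((w - iw) // 2, 0)  # destination offset
    ch, cw = min(ih, h), min(iw, w)
    out[dy:dy + ch, dx:dx + cw] = img[sy:sy + ch, sx:sx + cw]
    return out


def build_transform_stack(obs, n_rotations=16,
                          scales=(0.75, 1.0, 1.5, 2.0, 2.5)):
    """Stack of m = n_rotations * len(scales) rotated/scaled views of obs."""
    h, w = obs.shape
    stack = []
    for k in range(n_rotations):
        angle = 360.0 * k / n_rotations  # 16 rotations covering 360 degrees
        rotated = rotate(obs, angle, reshape=False, order=1)
        for s in scales:
            stack.append(fit_to(zoom(rotated, s, order=1), h, w))
    return np.stack(stack)


def epsilon(step, eps0=1.0, half_life=5000):
    """Decaying epsilon-greedy schedule: halves every `half_life` steps."""
    return eps0 * 0.5 ** (step / half_life)


obs = np.random.rand(128, 128).astype(np.float32)
stack = build_transform_stack(obs)
print(stack.shape)    # (80, 128, 128)
print(epsilon(5000))  # 0.5
```

With the stated defaults, the 16 × 5 grid of transforms produces exactly the m = 80 stacked observations fed to the unfolding network, and ϵ falls from 1.0 to 0.5 after 5000 action-primitive selections.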