DexMV: Imitation Learning for Dexterous Manipulation from Human Videos

Demonstration Translation

Raw Video

Pose Estimation

Robot Motion (rendered)

Main Results

We use our DexMV pipeline with DAPG as the imitation learning algorithm. The RL baseline (TRPO) is trained without demonstrations.

Pour

Objective: reach the mug and pour the particles inside it into a container. The robot needs to control the orientation of the mug to pour the particles. This task is evaluated by the percentage of particles poured into the container.
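The Pour metric can be illustrated with a minimal sketch. It assumes the container interior is approximated by an axis-aligned box and that final particle positions are available as (x, y, z) tuples; the function name `pour_percentage` and its parameters are hypothetical and not taken from the DexMV code.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def pour_percentage(
    particles: Sequence[Point],
    box_min: Point,
    box_max: Point,
) -> float:
    """Percentage of particles that end up inside the container.

    Assumption: the container interior is an axis-aligned box given by
    its minimum and maximum corners (a simplification for illustration).
    """
    if not particles:
        return 0.0
    inside = sum(
        all(lo <= c <= hi for c, lo, hi in zip(p, box_min, box_max))
        for p in particles
    )
    return 100.0 * inside / len(particles)


# Hypothetical final particle positions, in meters.
example_particles = [
    (0.0, 0.0, 0.0),
    (0.5, 0.5, 0.5),
    (2.0, 0.0, 0.0),   # outside in x
    (0.0, 0.0, -1.0),  # below the container
]
score = pour_percentage(example_particles, (-1.0, -1.0, 0.0), (1.0, 1.0, 1.0))
```

A real simulator would more likely query particle positions from the physics engine and use the container's actual geometry; the box test here only stands in for that check.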

Ours

Place Inside

Objective: pick up the banana and place it inside the mug. The robot needs to rotate the banana to a suitable orientation before placing it inside the mug. This task is evaluated by the percentage of the banana's volume that is inside the mug.
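One way to estimate such a volume percentage is by point sampling. The sketch below assumes the banana is represented by points sampled uniformly from its volume and that the mug interior is approximated by an upright cylinder; the function `inside_volume_percentage` and all of its parameters are hypothetical, not the metric implementation used by DexMV.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def inside_volume_percentage(
    object_points: Sequence[Point],
    center_xy: Tuple[float, float],
    radius: float,
    z_bottom: float,
    z_top: float,
) -> float:
    """Estimate the percentage of an object's volume inside a cylinder.

    Assumption: object_points are sampled uniformly from the object's
    volume, so the fraction of points inside approximates the fraction
    of volume inside.
    """
    if not object_points:
        return 0.0
    cx, cy = center_xy
    inside = 0
    for x, y, z in object_points:
        within_radius = (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2
        within_height = z_bottom <= z <= z_top
        if within_radius and within_height:
            inside += 1
    return 100.0 * inside / len(object_points)


# Hypothetical banana sample points, in meters.
banana_points = [
    (0.00, 0.0, 0.05),
    (0.02, 0.0, 0.05),
    (0.10, 0.0, 0.05),  # outside the mug radius
    (0.00, 0.0, 0.20),  # above the mug rim
]
percent_inside = inside_volume_percentage(
    banana_points, (0.0, 0.0), radius=0.04, z_bottom=0.0, z_top=0.1
)
```

The estimate becomes more accurate as the number of sample points grows; with a mesh-based simulator, the same idea applies after transforming the samples by the object's current pose.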

Ours

Relocate

Objective: move the object to the target position, regardless of orientation. The transparent green shape represents the goal location, which is randomized for each episode. This task is evaluated by the distance between the object and the target position.
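Because orientation is ignored, the Relocate metric reduces to a position-only distance. The following is a minimal sketch, assuming Euclidean distance between 3D positions averaged over episodes; the function names are hypothetical and the averaging step is an assumption rather than a detail stated above.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def relocate_distance(object_pos: Point, target_pos: Point) -> float:
    """Euclidean distance between object and target; orientation is ignored."""
    return math.dist(object_pos, target_pos)


def mean_relocate_distance(episodes: Sequence[Tuple[Point, Point]]) -> float:
    """Average final distance over (object_pos, target_pos) pairs.

    Assumption: each episode has its own randomized target position.
    """
    if not episodes:
        raise ValueError("at least one episode is required")
    return sum(relocate_distance(o, t) for o, t in episodes) / len(episodes)


# Hypothetical final positions from two episodes, in meters.
single = relocate_distance((0.0, 0.0, 0.0), (3.0, 4.0, 0.0))
average = mean_relocate_distance([
    ((0.0, 0.0, 0.0), (3.0, 4.0, 0.0)),
    ((1.0, 1.0, 1.0), (1.0, 1.0, 1.0)),
])
```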

Ours

More Relocate Visualization

Demonstration Transfer

(i) Different Size

Left: we use demonstrations of relocating a normal-sized tomato soup can and train on a larger tomato soup can.
Right: we use demonstrations of relocating a normal-sized tomato soup can and train on a smaller tomato soup can.

Larger

Smaller

(ii) Different Object

Left: we use demonstrations of relocating a tomato soup can and train on a potted meat can.
Right: we use demonstrations of relocating a sugar box and train on a foam brick.

BibTeX

@misc{qin2021dexmv,
  title={DexMV: Imitation Learning for Dexterous Manipulation from Human Videos},
  author={Qin, Yuzhe and Wu, Yueh-Hua and Liu, Shaowei and Jiang, Hanwen and Yang, Ruihan and Fu, Yang and Wang, Xiaolong},
  year={2021},
  archivePrefix={arXiv},
  primaryClass={cs.LG}
}