Motion Reconstruction and Imitation from Monocular Videos

We present SLoMo: a first-of-its-kind framework for transferring skilled motions from casually captured "in-the-wild" video footage of humans and animals to legged robots. SLoMo works in three stages: 1) synthesize a physically plausible reconstructed key-point trajectory from monocular videos; 2) optimize, offline, a dynamically feasible reference trajectory for the robot that closely tracks the key points and includes body motion, foot motion, and a contact sequence; 3) track the reference trajectory online using a general-purpose model-predictive controller on robot hardware. Traditional motion imitation for legged motor skills often requires expert animators, collaborative demonstrations, and/or expensive motion-capture equipment, all of which limit scalability. Instead, SLoMo relies only on easy-to-obtain videos, which are readily available in online repositories such as YouTube, and converts them into motion primitives that can be executed reliably by real-world robots. We demonstrate our approach by transferring the motions of cats, dogs, and humans to example robots, including a quadruped (on hardware) and a humanoid (in simulation).
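The data flow between the three stages can be illustrated with a toy, one-dimensional sketch. This is not the SLoMo implementation: every function name, parameter, and number below is hypothetical, and each stage is replaced by a deliberately simple stand-in (a moving average for reconstruction, a height threshold for contact labeling, and a proportional update in place of model-predictive control). It only shows how a noisy key-point track could become a reference with a contact sequence, which is then tracked step by step.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Reference:
    """Hypothetical container for a 1-D foot reference (not SLoMo's data format)."""
    foot_height: List[float]  # desired foot height per time step, in meters
    in_contact: List[bool]    # True where the foot is assumed to touch the ground


def reconstruct_keypoints(raw_heights: List[float], window: int = 3) -> List[float]:
    """Stage 1 stand-in: smooth a noisy key-point track and keep it above the ground.

    A centered moving average replaces physically plausible reconstruction here.
    """
    half = window // 2
    smoothed = []
    for i in range(len(raw_heights)):
        lo, hi = max(0, i - half), min(len(raw_heights), i + half + 1)
        avg = sum(raw_heights[lo:hi]) / (hi - lo)
        smoothed.append(max(0.0, avg))  # no ground penetration
    return smoothed


def optimize_reference(keypoints: List[float], contact_threshold: float = 0.02) -> Reference:
    """Stage 2 stand-in: derive a contact sequence and a reference that follows the key points.

    A height threshold replaces offline trajectory optimization here.
    """
    in_contact = [h <= contact_threshold for h in keypoints]
    # Snap contact samples to the ground so the reference agrees with its contact flags.
    foot_height = [0.0 if c else h for h, c in zip(keypoints, in_contact)]
    return Reference(foot_height=foot_height, in_contact=in_contact)


def track_reference(ref: Reference, gain: float = 0.5, start: float = 0.0) -> List[float]:
    """Stage 3 stand-in: follow the reference online, one time step at a time.

    A proportional update replaces the model-predictive controller here.
    """
    state = start
    executed = []
    for target in ref.foot_height:
        state += gain * (target - state)
        executed.append(state)
    return executed


def run_pipeline(raw_heights: List[float]):
    """Chain the three stand-in stages and return every intermediate result."""
    keypoints = reconstruct_keypoints(raw_heights)
    reference = optimize_reference(keypoints)
    executed = track_reference(reference)
    return keypoints, reference, executed


if __name__ == "__main__":
    # Made-up foot heights (meters) standing in for a track extracted from video.
    raw = [0.00, 0.01, 0.06, 0.10, 0.07, 0.01, -0.01]
    keypoints, reference, executed = run_pipeline(raw)
    print("contact sequence:", reference.in_contact)
    print("executed heights:", [round(h, 3) for h in executed])
```

The stages are kept as separate functions to mirror the split described above: reconstruction and reference generation run offline, while only the tracking loop would run online on the robot.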

Check out our project websites:

PPR
SLoMo

Related Papers

2024 May	SLoMo: A General System for Legged Robot Motion Imitation from Casual Videos. John Zhang, Shuo Yang, Gengshan Yang, Arun Bishop, Swaminathan Gurumurthy, Deva Ramanan, and Zac Manchester. Robotics and Automation Letters (RA-L) & International Conference on Robotics and Automation (ICRA).
2023 October	PPR: Physically Plausible Reconstruction from Monocular Videos. Gengshan Yang, Shuo Yang, John Zhang, Zac Manchester, and Deva Ramanan. IEEE International Conference on Computer Vision (ICCV). Paris, France.

People

John Zhang

Shuo Yang

Arun Bishop

Swaminathan Gurumurthy

Zac Manchester