

Poster

How to Move Your Dragon: Text-to-Motion Synthesis for Large-Vocabulary Objects

Wonkwang Lee · Jongwon Jeong · Taehong Moon · Hyeon-Jong Kim · Jaehyeon Kim · Gunhee Kim · Byeong-Uk Lee

East Exhibition Hall A-B #E-3206
[ Project Page ]
Tue 15 Jul 11 a.m. PDT — 1:30 p.m. PDT

Abstract:

Motion synthesis for diverse object categories holds great potential for 3D content creation but remains underexplored due to two key challenges: (1) the lack of comprehensive motion datasets that include a wide range of high-quality motions and annotations, and (2) the absence of methods capable of handling heterogeneous skeletal templates from diverse objects. To address these challenges, we contribute the following.

First, we augment the Truebones Zoo dataset, a high-quality animal motion dataset covering over 70 species, by annotating it with detailed text descriptions, making it suitable for text-based motion synthesis.

Second, we introduce rig augmentation techniques that generate diverse motion data while preserving consistent dynamics, enabling models to adapt to various skeletal configurations.

Finally, we redesign existing motion diffusion models to dynamically adapt to arbitrary skeletal templates, enabling motion synthesis for a diverse range of objects with varying structures.

Experiments show that our method learns to generate high-fidelity motions from textual descriptions for diverse and even unseen objects, setting a strong foundation for motion synthesis across diverse object categories and skeletal templates. Qualitative results are available at this link.
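To make the final contribution concrete: one standard way a diffusion denoiser can "dynamically adapt to arbitrary skeletal templates" is to treat each joint as a token in a transformer and mask out padded joints, so one set of weights serves skeletons of any size. The sketch below illustrates only that general idea; it is not the authors' implementation, and all module names, feature sizes, and the noise-prediction setup are hypothetical.

```python
import torch
import torch.nn as nn

class SkeletonAwareDenoiser(nn.Module):
    """Toy transformer denoiser treating each joint as a token, so the same
    weights can process skeletons with different joint counts.
    Illustrative only; not the paper's architecture."""

    def __init__(self, feat_dim=6, d_model=128, n_heads=4, n_layers=2):
        super().__init__()
        self.in_proj = nn.Linear(feat_dim, d_model)   # embed per-joint motion features
        self.t_embed = nn.Embedding(1000, d_model)    # diffusion timestep embedding
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.out_proj = nn.Linear(d_model, feat_dim)  # predict per-joint noise

    def forward(self, x, t, joint_pad_mask):
        # x: (batch, joints, feat_dim), zero-padded to the largest skeleton
        # joint_pad_mask: (batch, joints), True where a joint slot is padding
        h = self.in_proj(x) + self.t_embed(t)[:, None, :]
        h = self.encoder(h, src_key_padding_mask=joint_pad_mask)
        return self.out_proj(h)

# Two skeletons of different sizes share one batch via padding + masking.
model = SkeletonAwareDenoiser()
x = torch.randn(2, 40, 6)                  # pad both skeletons to 40 joints
mask = torch.zeros(2, 40, dtype=torch.bool)
mask[0, 25:] = True                        # first object has only 25 real joints
t = torch.randint(0, 1000, (2,))
noise_pred = model(x, t, mask)             # (2, 40, 6)
```

Because the denoiser never hardcodes a joint count, the same model can, in principle, be conditioned on rig-augmented variants of a skeleton or applied to unseen skeletal templates at inference time.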

Lay Summary:

Creating realistic 3D animations of animals, creatures, or objects is often a time-consuming process that requires manual effort and technical expertise. And while recent AI tools can generate motion from text descriptions, they mostly work only for human-like characters, leaving out a wide range of creatures, from birds and horses to dragons.

We set out to change that. Our team developed a new approach that teaches AI how to animate a large variety of objects just by reading a short text description, like "a bird taking flight" or "a dragon soaring through the sky." We started by expanding an existing animal motion dataset with rich text labels and then created techniques to help the AI understand and adapt to different skeletons, like wings, tails, or extra limbs.

This opens up exciting possibilities for animators, game developers, and artists, making it easier than ever to bring diverse creatures to life. We're also sharing our tools with the community to support future work in this space.
