Poster
MimicMotion: High-Quality Human Motion Video Generation with Confidence-aware Pose Guidance
Yuang Zhang · Jiaxi Gu · Li-Wen Wang · Han Wang · Junqi Cheng · Yuefeng Zhu · Fangyuan Zou
East Exhibition Hall A-B #E-2909
In recent years, while generative AI has advanced significantly in image generation, video generation continues to face challenges in controllability, length, and detail quality, which hinder its application. We present MimicMotion, a framework for generating high-quality human videos of arbitrary length using motion guidance. Our approach has several highlights. Firstly, we introduce confidence-aware pose guidance that ensures high frame quality and temporal smoothness. Secondly, we introduce regional loss amplification based on pose confidence, which reduces image distortion in key regions. Lastly, we propose a progressive latent fusion strategy to generate long and smooth videos. Experiments demonstrate the effectiveness of our approach in producing high-quality human motion videos. Videos and comparisons are available at https://tencent.github.io/MimicMotion.
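Both confidence-aware pose guidance and regional loss amplification rely on the per-keypoint confidence scores produced by the pose estimator. The PyTorch sketch below illustrates one way such scores could be turned into a spatial weight map that amplifies the denoising loss around reliable keypoints, with extra gain on hand keypoints. The function names, Gaussian mask construction, and gain values are illustrative assumptions, not the paper's exact implementation.

```python
import torch
import torch.nn.functional as F

def confidence_weight_map(keypoints, confidences, hand_indices, size,
                          sigma=8.0, hand_gain=2.0):
    """Build a per-pixel loss weight map from 2D keypoints and their
    detection confidences. Hand keypoints receive an extra gain so the
    denoising loss is amplified in hand regions (illustrative values)."""
    H, W = size
    ys = torch.arange(H).view(H, 1).float()
    xs = torch.arange(W).view(1, W).float()
    weight = torch.ones(H, W)
    for i, ((x, y), c) in enumerate(zip(keypoints, confidences)):
        gain = hand_gain if i in hand_indices else 1.0
        bump = torch.exp(-((xs - x) ** 2 + (ys - y) ** 2) / (2 * sigma ** 2))
        weight = weight + gain * float(c) * bump
    return weight

def weighted_denoising_loss(pred_noise, target_noise, weight_map):
    """Confidence-weighted MSE between predicted and target noise."""
    w = weight_map.to(pred_noise.dtype).unsqueeze(0).unsqueeze(0)  # 1x1xHxW
    # Resize to the latent resolution if the map was built at pixel scale.
    w = F.interpolate(w, size=pred_noise.shape[-2:], mode="bilinear",
                      align_corners=False)
    return (w * (pred_noise - target_noise) ** 2).mean()
```

In this sketch, low-confidence keypoints contribute little to the weight map, so occluded or uncertain body parts neither dominate the conditioning nor the loss, while clearly detected hands are emphasized.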
Generating realistic videos with AI remains challenging, particularly when creating detailed human movements over long durations. To tackle this, we developed MimicMotion, an AI framework that generates smooth, realistic videos of any length in which humans perform specified movements. Our method uses pose templates to guide human video generation precisely, ensuring each frame stays clear and movements flow naturally. We also enhance hand generation to prevent distortion and employ a progressive blending technique to produce smoother, longer videos. This technology opens up new possibilities in areas like filmmaking and animation.
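The progressive blending mentioned above refers to the progressive latent fusion strategy: long videos are generated as overlapping segments whose latents are blended in the overlap region. The snippet below is a minimal sketch of such blending with a linear ramp; the segment length, overlap size, and ramp schedule are assumptions for illustration rather than the authors' exact schedule.

```python
import torch

def progressive_latent_fusion(segments, overlap):
    """Fuse overlapping latent segments into one long sequence.

    segments: list of tensors shaped (frames, C, H, W); consecutive
    segments share `overlap` frames. Overlapping frames are blended with
    a linear ramp so each transition fades progressively from one segment
    to the next (illustrative scheme)."""
    fused = segments[0]
    for seg in segments[1:]:
        ramp = torch.linspace(0, 1, overlap).view(overlap, 1, 1, 1)
        tail = fused[-overlap:]   # end of the sequence fused so far
        head = seg[:overlap]      # start of the next segment
        blended = (1 - ramp) * tail + ramp * head
        fused = torch.cat([fused[:-overlap], blended, seg[overlap:]], dim=0)
    return fused

# Example: three 16-frame segments with a 4-frame overlap -> 40 fused frames.
segs = [torch.randn(16, 4, 32, 32) for _ in range(3)]
video_latents = progressive_latent_fusion(segs, overlap=4)
print(video_latents.shape)  # torch.Size([40, 4, 32, 32])
```

Because the blending happens in latent space before decoding, segment boundaries fade gradually instead of producing visible jumps between clips.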