Poster
RIFLEx: A Free Lunch for Length Extrapolation in Video Diffusion Transformers
Min Zhao · Guande He · Yixiao Chen · Hongzhou Zhu · Chongxuan Li · Jun Zhu
East Exhibition Hall A-B #E-3204
While AI can now create short, high-quality videos, making them significantly longer while keeping motion smooth and non-repetitive remains a major hurdle. Existing methods often introduce temporal artifacts such as repeated actions or unnatural slowdowns.

We investigated how these video models handle time. Our analysis revealed that a specific frequency component in how the model encodes temporal position, which we call the "intrinsic frequency," is primarily responsible for these extrapolation failures. Based on this finding, we developed a simple technique named RIFLEx that adjusts this frequency.

RIFLEx offers a straightforward way to improve long video generation. It allows advanced models to double video length smoothly without any extra training, essentially providing a performance boost "for free." Furthermore, with only a small amount of fine-tuning (and without ever training on long videos), RIFLEx can triple video length while improving quality. This work makes generating extended, coherent video content with AI more practical.
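As a rough illustration of the idea (not the authors' released implementation), the sketch below assumes the model uses rotary position embeddings (RoPE) along the time axis. The function names, the rule used to pick the "intrinsic" component, and the frame counts are hypothetical; it simply lowers one temporal frequency so that its period covers the longer target video.

import numpy as np

def rope_temporal_freqs(dim: int, base: float = 10000.0) -> np.ndarray:
    # Standard RoPE frequencies for the temporal axis (dim must be even).
    return 1.0 / (base ** (np.arange(0, dim, 2) / dim))

def riflex_style_adjust(freqs: np.ndarray, train_frames: int, target_frames: int) -> np.ndarray:
    # Illustrative selection rule (an assumption): treat the component whose
    # period is the shortest one still covering the training length as the
    # "intrinsic frequency" responsible for repetition when extrapolating.
    periods = 2.0 * np.pi / freqs
    candidates = np.where(periods >= train_frames)[0]
    intrinsic = candidates[np.argmin(periods[candidates])]
    adjusted = freqs.copy()
    # Scale that frequency down so a single period now spans the target
    # length; all other frequency components are left untouched.
    adjusted[intrinsic] = freqs[intrinsic] * train_frames / target_frames
    return adjusted

if __name__ == "__main__":
    freqs = rope_temporal_freqs(dim=64)
    longer = riflex_style_adjust(freqs, train_frames=49, target_frames=98)  # 2x extrapolation
    print("changed components:", np.flatnonzero(freqs != longer))

The design intent is that the adjusted component no longer wraps around within the extended video, which is the kind of wrap-around that produces repeated motion; everything else about the model and its sampling procedure stays unchanged, which is why the 2x setting requires no retraining.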