

Oral

Inductive Moment Matching

Linqi (Alex) Zhou · Stefano Ermon · Jiaming Song

West Exhibition Hall C
Oral 2A: Diffusion Models
Tue 15 Jul 4 p.m. — 4:15 p.m. PDT

Abstract:

Diffusion models and Flow Matching generate high-quality samples but are slow at inference, and distilling them into few-step models often leads to instability and extensive tuning. To resolve these trade-offs, we propose Inductive Moment Matching (IMM), a new class of generative models for one- or few-step sampling with a single-stage training procedure. Unlike distillation, IMM requires neither pre-training initialization nor the optimization of two networks; and unlike Consistency Models, IMM guarantees distribution-level convergence and remains stable under various hyperparameters and standard model architectures. IMM surpasses diffusion models on ImageNet 256×256 with 2.13 FID using only 8 inference steps and achieves a state-of-the-art 2-step FID of 2.05 on CIFAR-10 for a model trained from scratch.
