

Poster

Variational Rectified Flow Matching

Pengsheng Guo · Alex Schwing

East Exhibition Hall A-B #E-3306
Wed 16 Jul 11 a.m. PDT — 1:30 p.m. PDT

Abstract:

We study Variational Rectified Flow Matching, a framework that enhances classic rectified flow matching by modeling multi-modal velocity vector-fields. At inference time, classic rectified flow matching 'moves' samples from a source distribution to the target distribution by solving an ordinary differential equation via integration along a velocity vector-field. At training time, the velocity vector-field is learnt by linearly interpolating between coupled samples, one drawn randomly from the source distribution and one drawn randomly from the target distribution. This leads to "ground-truth" velocity vector-fields that point in different directions at the same location, i.e., the velocity vector-fields are multi-modal/ambiguous. However, since training uses a standard mean-squared-error loss, the learnt velocity vector-field averages the "ground-truth" directions and isn't multi-modal. In contrast, variational rectified flow matching learns and samples from multi-modal flow directions. We show on synthetic data, MNIST, CIFAR-10, and ImageNet that variational rectified flow matching leads to compelling results.
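For readers unfamiliar with the baseline, the following is a minimal sketch of the classic rectified flow matching training objective described above: sample a time, linearly interpolate between a coupled source/target pair, and regress the straight-line velocity with a mean-squared-error loss. The network, dimensions, and data are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class VelocityNet(nn.Module):
    """Small MLP predicting a velocity vector for a (sample, time) pair."""
    def __init__(self, dim: int = 2, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim + 1, hidden), nn.SiLU(),
            nn.Linear(hidden, hidden), nn.SiLU(),
            nn.Linear(hidden, dim),
        )

    def forward(self, x_t: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([x_t, t], dim=-1))

def rectified_flow_loss(model: VelocityNet, x0: torch.Tensor, x1: torch.Tensor) -> torch.Tensor:
    """Classic rectified flow matching loss.

    Regresses the velocity at a point on the straight-line interpolation
    between a source sample x0 and a target sample x1. Because different
    (x0, x1) couplings can pass through the same (x_t, t), the MSE loss
    makes the model average the conflicting "ground-truth" directions.
    """
    t = torch.rand(x0.shape[0], 1)        # random interpolation time in [0, 1]
    x_t = (1.0 - t) * x0 + t * x1         # linear interpolation between the pair
    target_velocity = x1 - x0             # straight-path "ground-truth" velocity
    return ((model(x_t, t) - target_velocity) ** 2).mean()

# Toy usage with a 2-D Gaussian source and a shifted stand-in target.
model = VelocityNet(dim=2)
x0 = torch.randn(64, 2)
x1 = torch.randn(64, 2) * 0.5 + 3.0
loss = rectified_flow_loss(model, x0, x1)
loss.backward()
```

At inference, samples are transported by integrating the ODE dx/dt = model(x, t) from t = 0 to t = 1, e.g. with a simple Euler scheme.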

Lay Summary:

At the heart of recent advances in diffusion and flow matching methods lies a major assumption: existing models assume a single deterministic velocity at each point in the data-time space. With Variational Rectified Flow Matching (VRFM), we introduce a new framework that models the multi-modal distribution over velocity directions, unlocking a richer class of generative models. VRFM is a shift in how we think about learning continuous dynamics. We demonstrate consistent, strong performance across datasets (synthetic, MNIST, CIFAR-10, ImageNet) and architectures (UNet, DiT). This opens the door to more expressive, uncertainty-aware generative modeling.
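To make "modeling the multi-modal distribution over velocity directions" concrete, below is one plausible illustration of the idea: condition the velocity prediction on a latent code inferred from the coupled pair, trained with a VAE-style reparameterization and KL term. This is a hypothetical sketch, not the paper's exact formulation; all names and hyper-parameters are assumptions.

```python
import torch
import torch.nn as nn

class LatentVelocityNet(nn.Module):
    """Velocity network conditioned on a latent code z (illustrative sketch).

    The encoder sees the coupled pair (x0, x1), so it can resolve which of the
    ambiguous flow directions applies; the decoder predicts a velocity given
    (x_t, t, z). Different latent samples yield different flow directions.
    """
    def __init__(self, dim: int = 2, latent: int = 4, hidden: int = 128):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(3 * dim + 1, hidden), nn.SiLU(),
            nn.Linear(hidden, 2 * latent),      # mean and log-variance of q(z | ...)
        )
        self.decoder = nn.Sequential(
            nn.Linear(dim + 1 + latent, hidden), nn.SiLU(),
            nn.Linear(hidden, dim),             # velocity prediction
        )

    def forward(self, x0, x1, x_t, t):
        stats = self.encoder(torch.cat([x0, x1, x_t, t], dim=-1))
        mu, log_var = stats.chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * log_var).exp()   # reparameterization
        v = self.decoder(torch.cat([x_t, t, z], dim=-1))
        kl = 0.5 * (mu.pow(2) + log_var.exp() - 1.0 - log_var).sum(-1).mean()
        return v, kl
```

In such a setup, inference would draw z from a prior and integrate the ODE for that fixed z, so repeated draws sample different, multi-modal flow directions.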
