Spotlight Poster
Improving Consistency Models with Generator-Augmented Flows
Thibaut Issenhuth · Sangchul Lee · Ludovic Dos Santos · Jean-Yves Franceschi · Chansoo Kim · Alain Rakotomamonjy
East Exhibition Hall A-B #E-3200
Consistency models imitate the multi-step sampling of score-based diffusion in a single forward pass of a neural network. They can be learned in two ways: consistency distillation and consistency training. The former relies on the true velocity field of the corresponding differential equation, approximated by a pre-trained neural network. In contrast, the latter uses a single-sample Monte Carlo estimate of this velocity field. The related estimation error induces a discrepancy between consistency distillation and training that, we show, still holds in the continuous-time limit. To alleviate this issue, we propose a novel flow that transports noisy data towards their corresponding outputs derived from a consistency model. We prove that this flow reduces the previously identified discrepancy and the noise-data transport cost. Consequently, our method not only accelerates consistency training convergence but also enhances its overall performance. The code is available at https://github.com/thibautissenhuth/consistency_GC.
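To make the contrast concrete, the sketch below compares the single-sample Monte Carlo velocity estimate of standard consistency training with one plausible reading of the generator-augmented coupling, in which the data endpoint of the interpolation is replaced by the consistency model's own one-step output for the paired noise. All function names (`standard_flow_sample`, `generator_augmented_sample`) are hypothetical, and the exact objective in the paper differs in details; this is only an illustration of the two couplings, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def standard_flow_sample(x0, t):
    """Independent coupling used in consistency training (sketch):
    interpolate a data point with freshly drawn noise and form the
    single-sample Monte Carlo velocity estimate v = z - x0."""
    z = rng.standard_normal(x0.shape)
    x_t = (1.0 - t) * x0 + t * z
    v = z - x0
    return x_t, v

def generator_augmented_sample(x0, t, generator):
    """Generator-augmented coupling (sketch): transport noise towards
    the consistency model's output for that same noise, rather than
    towards an independently drawn data point. Pairing each noise
    sample with its own generated endpoint is what reduces the
    noise-data transport cost in the abstract's sense."""
    z = rng.standard_normal(x0.shape)
    x0_hat = generator(z)  # one-step generation from the paired noise
    x_t = (1.0 - t) * x0_hat + t * z
    v = z - x0_hat
    return x_t, v
```

At `t = 0` the standard interpolant recovers the data point exactly, while at `t = 1` both couplings reduce to pure noise; the difference lies entirely in which endpoint the velocity points away from.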
Most image generation models work by gradually turning random noise into a clear image, a process that can be slow and resource-intensive. To speed this up, methods like Consistency Models (CMs) have been developed. These neural network-based models can generate images in just one step instead of many. There are two ways to train CMs: (i) by imitating an already trained diffusion model, or (ii) by training from scratch without using such a pre-trained model. The second method is attractive because it doesn't require an existing model. The question we address is whether these two training methods are equivalent, and we provide a negative answer. Indeed, when a CM is trained from scratch, we prove mathematically that an extra error term affects the model, making it different from the first method. To alleviate the effect of this term, we introduce a simple solution called Generator-Augmented Flows. This method feeds the model's own predictions back into its training process. As a result, Generator-Augmented Flows help the model learn faster while generating better images. These findings show how important it is to design training methods that reduce randomness during the training of CMs.