Spotlight Poster
Direct Discriminative Optimization: Your Likelihood-Based Visual Generative Model is Secretly a GAN Discriminator
Kaiwen Zheng · Yongxin Chen · Huayu Chen · Guande He · Ming-Yu Liu · Jun Zhu · Qinsheng Zhang
East Exhibition Hall A-B #E-3111
Modern AI image generators can produce stunning visuals, but the way they are trained still has a key weakness: they tend to play it safe by trying to cover all possible outputs, which leads to blurry or less realistic images when the model size is limited. This happens because the most common training method, maximum likelihood estimation, encourages covering every possibility rather than focusing on the most likely or high-quality results.

To overcome this, we introduce a new approach called Direct Discriminative Optimization (DDO). It improves training by helping the model learn from its own mistakes, identifying and correcting low-quality outputs, without needing a separate "judge" model as in GANs. Inspired by recent techniques in AI alignment, DDO works by comparing the model being trained to a fixed reference model and using the difference between them as a learning signal.

DDO can upgrade existing models efficiently and drastically improve image quality. In our experiments, it helped leading AI models produce sharper, more realistic images across several standard benchmarks, outperforming previous best results without needing extra tricks or more training data.
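
To make the "compare the model to a fixed reference" idea concrete, here is a minimal sketch. It rests on an assumption, since the summary above does not spell out the objective: that the learning signal is a GAN-discriminator-style loss whose logit is the log-likelihood ratio log p_model(x) - log p_ref(x), with real images treated as positives and samples from the frozen reference treated as negatives. The function names and the toy log-likelihood values are hypothetical and serve only as illustration.

```python
import math


def log_sigmoid(z: float) -> float:
    """Numerically stable log(sigmoid(z))."""
    if z >= 0:
        return -math.log1p(math.exp(-z))
    return z - math.log1p(math.exp(z))


def ddo_style_loss(logp_model_real, logp_ref_real, logp_model_fake, logp_ref_fake):
    """Discriminator-style loss with an implicit 'judge' (assumed form).

    The logit for a sample x is log p_model(x) - log p_ref(x), so no separate
    discriminator network is needed. 'real' samples come from the dataset;
    'fake' samples are assumed to be drawn from the frozen reference model.
    """
    # Real data: push the likelihood ratio up, i.e. minimize -log sigmoid(logit).
    real_terms = [
        -log_sigmoid(m - r) for m, r in zip(logp_model_real, logp_ref_real)
    ]
    # Reference samples: push the ratio down, i.e. minimize -log(1 - sigmoid(logit)),
    # which equals -log sigmoid(-logit).
    fake_terms = [
        -log_sigmoid(-(m - r)) for m, r in zip(logp_model_fake, logp_ref_fake)
    ]
    return sum(real_terms) / len(real_terms) + sum(fake_terms) / len(fake_terms)


# Toy log-likelihoods (hypothetical numbers).
# When the model equals the reference, every logit is zero.
baseline = ddo_style_loss([-1.0, -2.0], [-1.0, -2.0], [-3.0], [-3.0])

# When the model rates real data higher and reference samples lower than the
# reference does, the loss drops below that baseline.
improved = ddo_style_loss([-0.5], [-1.0], [-3.5], [-3.0])
```

In this sketch the loss starts at 2 log 2 when the model is initialized as a copy of the reference, and it decreases as the model shifts likelihood toward real data and away from the reference's own samples. In a real setting the log-likelihoods would come from the generative model itself (or a tractable surrogate), which is beyond what this toy example covers.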