Poster
REG: Rectified Gradient Guidance for Conditional Diffusion Models
Zhengqi Gao · Kaiwen Zha · Tianyuan Zhang · Zihui Xue · Duane Boning
East Exhibition Hall A-B #E-2609
Guidance techniques are simple yet effective for improving conditional generation in diffusion models. Despite their empirical success, the practical implementation of guidance diverges significantly from its theoretical motivation. In this paper, we reconcile this discrepancy by replacing the scaled marginal distribution target, which we prove theoretically invalid, with a valid scaled joint distribution objective. Additionally, we show that the established guidance implementations are approximations to the intractable optimal solution under a no-future-foresight constraint. Building on these theoretical insights, we propose rectified gradient guidance (REG), a versatile enhancement designed to boost the performance of existing guidance methods. Experiments in 1D and 2D settings demonstrate that REG provides a better approximation to the optimal solution than prior guidance techniques, validating the proposed theoretical framework. Extensive experiments on class-conditional ImageNet and text-to-image generation tasks show that incorporating REG consistently improves FID and Inception/CLIP scores across various settings compared to the corresponding baselines without REG.
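For context, the sketch below shows the standard classifier-free guidance combination of noise predictions that methods like REG are designed to enhance. The `rect_scale` hook and the function name are illustrative assumptions indicating where a rectification of the guidance term could be applied; they are not the paper's actual implementation.

```python
import numpy as np

def guided_noise_prediction(eps_uncond, eps_cond, w, rect_scale=None):
    """Classifier-free guidance combination of noise predictions.

    eps_uncond, eps_cond: unconditional / conditional network outputs for the
                          same noisy sample x_t (arrays of equal shape).
    w:                    guidance weight; w = 1 recovers plain conditional sampling.
    rect_scale:           optional multiplier on the guidance term (a hypothetical
                          hook illustrating where a rectification such as REG could
                          act; not the paper's exact rule).
    """
    guidance = eps_cond - eps_uncond          # direction pushing the sample toward the condition
    if rect_scale is not None:
        guidance = rect_scale * guidance      # rescale the gradient-like guidance term
    return eps_uncond + w * guidance          # standard CFG: eps_uncond + w * (eps_cond - eps_uncond)

# Toy usage with random stand-ins for the network outputs.
x_shape = (4, 8, 8)
eps_u = np.random.randn(*x_shape)
eps_c = np.random.randn(*x_shape)
eps_guided = guided_noise_prediction(eps_u, eps_c, w=3.0)
```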
Diffusion models can create realistic images from random noise. To guide these models toward specific goals—like generating images of a certain class—techniques called “guidance” are used. However, current guidance methods don’t fully align with their theoretical foundations. Our work resolves this mismatch by proposing a new, more accurate theory and a method called Rectified Gradient Guidance (REG). REG improves the quality of generated images across multiple tasks while remaining compatible with existing systems, helping make diffusion models more reliable and effective.