Poster in Workshop: Actionable Interpretability

Towards Understanding the Mechanisms of Classifier-Free Guidance

Xiang Li · Rongrong Wang · Qing Qu


Abstract:

Classifier-free guidance (CFG) is a core technique powering state-of-the-art image generation systems, yet its underlying mechanisms remain poorly understood. In this work, we first analyze CFG in a simplified linear diffusion model, where its behavior closely mirrors that observed in the nonlinear case. Our analysis reveals that linear CFG improves generation quality via three distinct components: (i) a mean-shift term that steers samples toward the class mean, (ii) a positive Contrastive Principal Components (CPC) term that amplifies class-specific features, and (iii) a negative CPC term that suppresses generic features present in unconditional data. We then verify that these insights extend to real-world, nonlinear diffusion models: over a broad range of noise levels, linear CFG replicates the behavior of its nonlinear counterpart. Although the two eventually diverge at low noise levels, our theoretical insights from the linear setting guide us in constructing effective guidance directions within the nonlinear regime.
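
For reference, the standard classifier-free guidance update combines the conditional and unconditional noise predictions. The notation below is the usual one from the CFG literature rather than from this abstract, and the closing remark on the linear case is only an illustrative reading of the three components listed above, not the paper's exact decomposition:

$$\tilde{\epsilon}_\theta(x_t, c) \;=\; \epsilon_\theta(x_t, \varnothing) \;+\; w\,\bigl(\epsilon_\theta(x_t, c) - \epsilon_\theta(x_t, \varnothing)\bigr),$$

where $w$ is the guidance scale ($w = 1$ recovers ordinary conditional sampling, and $w > 1$ strengthens the conditioning signal). In a linear (Gaussian) model both predictors are affine in $x_t$, so the guidance direction $\epsilon_\theta(x_t, c) - \epsilon_\theta(x_t, \varnothing)$ plausibly separates into a data-independent mean-difference term (the mean shift toward the class mean) and data-dependent terms driven by the mismatch between conditional and unconditional covariance structure (the positive and negative CPC components).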
