Poster in Workshop: Actionable Interpretability
Interpretable Diffusion Models with B-cos Networks
Nicola Bernold · Moritz Vandenhirtz · Alice Bizeul · Julia Vogt
Text-to-image diffusion models generate images by iteratively denoising random noise, conditioned on a textual prompt. While these models have enabled impressive progress in realistic image generation, they often fail to accurately reflect specific characteristics described in the prompt, and such failures are difficult to detect automatically. In this work, we introduce a diffusion model architecture built with B-cos modules that offers inherent interpretability. Our approach enables a better understanding of how individual prompt tokens influence the generated image, bridging the gap between performance and transparency. We demonstrate that B-cos diffusion models can produce high-quality images while providing meaningful insights into prompt-image alignment.
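To give a sense of the building block the abstract refers to: a B-cos transform (Böhle et al.) replaces an ordinary linear unit with one whose response is additionally scaled by the alignment between input and weight, so each output is an input-dependent linear map whose weights can be read off as an explanation. The sketch below is a minimal, pure-Python illustration of a single B-cos unit under that definition; it is not the authors' implementation, and the function name and the choice B = 2 are ours.

```python
import math

def bcos_unit(x, w, B=2.0):
    """Minimal sketch of a single B-cos unit.

    Computes |cos(x, w)|^(B-1) * (w . x): the ordinary linear response
    scaled down for inputs poorly aligned with the weight vector.
    For B = 1 this reduces to a standard linear unit.
    """
    dot = sum(xi * wi for xi, wi in zip(x, w))
    nx = math.sqrt(sum(xi * xi for xi in x))
    nw = math.sqrt(sum(wi * wi for wi in w))
    if nx == 0.0 or nw == 0.0:
        return 0.0  # cosine undefined for a zero vector; no response
    cos = dot / (nx * nw)
    return abs(cos) ** (B - 1) * dot

# Aligned input passes through; orthogonal input is suppressed entirely.
print(bcos_unit([1.0, 0.0], [1.0, 0.0]))  # fully aligned -> 1.0
print(bcos_unit([0.0, 1.0], [1.0, 0.0]))  # orthogonal    -> 0.0
```

Because the effective weight `|cos|^(B-1) * w` depends on the input, the whole network remains a composition of input-specific linear maps, which is what allows token-level contributions to the generated image to be traced exactly rather than approximated post hoc.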