Oral in Workshop: DIG-BUGS: Data in Generative Models (The Bad, the Ugly, and the Greats)

Training Diffusion Models with Noisy Data via SFBD Flow

Haoye Lu · Darren Lo · Yaoliang Yu

Keywords: [ diffusion models ] [ ambient diffusion ] [ deconvolution ]

Sat 19 Jul 1:30 p.m. PDT — 1:45 p.m. PDT

Abstract:

Diffusion models achieve strong generative performance but often rely on large datasets that may include sensitive content. This challenge is compounded by the models’ tendency to memorize training data, raising privacy concerns. SFBD (Lu et al., 2025) addresses this by training on corrupted data and using a limited set of clean samples to capture local structure and improve convergence. However, its iterative denoising and fine-tuning loop requires manual coordination, making it burdensome to implement. We reinterpret SFBD as an alternating projection algorithm and introduce a continuous variant, SFBD flow, that removes the need for alternating steps. We further show its connection to consistency-constraint-based methods and demonstrate that its practical instantiation, Online SFBD, consistently outperforms strong baselines across benchmarks.
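To make the alternating-projection reading of SFBD concrete, below is a minimal sketch of the loop the abstract describes: one projection denoises the corrupted dataset with the current model, the other fine-tunes the model on the denoised samples plus the small clean set. This is not the authors' implementation; the names (ToyDenoiser, sfbd_loop), the toy denoiser, the objective, and all hyperparameters are illustrative assumptions.

```python
# Minimal sketch of the alternating-projection view of SFBD (hypothetical code,
# NOT the authors' implementation; all names and hyperparameters are stand-ins).
import torch
import torch.nn as nn

class ToyDenoiser(nn.Module):
    """Toy stand-in for a diffusion denoiser x_hat = f(x_noisy, sigma)."""
    def __init__(self, dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim + 1, 64), nn.ReLU(), nn.Linear(64, dim)
        )

    def forward(self, x_noisy: torch.Tensor, sigma: torch.Tensor) -> torch.Tensor:
        # Condition on the noise level by concatenating sigma as a feature.
        sig = sigma.expand(x_noisy.shape[0], 1)
        return self.net(torch.cat([x_noisy, sig], dim=1))

def sfbd_loop(model, noisy_data, clean_data,
              sigma=0.5, rounds=5, steps=200, lr=1e-3):
    """Alternate two 'projections': denoise the corrupted set, then fine-tune."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    sig = torch.tensor([sigma])
    for _ in range(rounds):
        # Projection 1: denoise the corrupted dataset with the current model.
        with torch.no_grad():
            denoised = model(noisy_data, sig)
        # Projection 2: fine-tune on denoised samples plus the small clean set,
        # using a simple denoising (reconstruction) objective.
        targets = torch.cat([denoised, clean_data])
        for _ in range(steps):
            noise = sigma * torch.randn_like(targets)
            pred = model(targets + noise, sig)
            loss = ((pred - targets) ** 2).mean()
            opt.zero_grad()
            loss.backward()
            opt.step()
    return model

# Toy usage: a corrupted dataset plus a small clean subset.
model = ToyDenoiser(dim=2)
noisy = torch.randn(512, 2) * 1.2   # stand-in for the corrupted data
clean = torch.randn(32, 2) * 0.3    # stand-in for the limited clean samples
model = sfbd_loop(model, noisy, clean)
```

Per the abstract, Online SFBD removes this explicit outer alternation by interleaving the two steps; the sketch keeps the separate phases only to mirror the alternating-projection interpretation.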
