

Oral in Workshop: The Impact of Memorization on Trustworthy Foundation Models

A Closer Look at Model Collapse: From a Generalization-to-Memorization Perspective

Lianghe Shi · Meng Wu · Huijie Zhang · Zekai Zhang · Molei Tao · Qing Qu

Sat 19 Jul 11 a.m. PDT — 11:15 a.m. PDT
 
presentation: The Impact of Memorization on Trustworthy Foundation Models
Sat 19 Jul 8:25 a.m. PDT — 5 p.m. PDT

Abstract:

This paper identifies a transition from generalization to memorization over the course of recursive training of diffusion models, providing a novel perspective for the study of model collapse. Specifically, as models are trained iteratively on their own generated samples, they increasingly replicate training data instead of generating novel content. This transition is directly driven by the declining entropy of the synthetic training data produced in each training cycle, which serves as a clear indicator of model degradation. Motivated by this insight, we propose an entropy-based data selection strategy that mitigates both the generalization-to-memorization transition and the accompanying quality degradation. Empirical results show that our approach significantly enhances visual quality and diversity in recursive generation, effectively preventing model collapse.
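The abstract describes two ingredients that lend themselves to a brief illustration: tracking the entropy of each cycle's synthetic data as a degradation indicator, and selecting a higher-entropy subset for the next training round. The sketch below is not the paper's implementation; it assumes each generated sample is represented by a feature vector (here, toy random vectors standing in for encoder features), uses the standard Kozachenko-Leonenko k-NN differential-entropy estimator, and uses greedy farthest-point selection as one plausible way to retain a diverse subset. The authors' actual estimator and selection criterion may differ.

```python
# Minimal sketch (not the paper's code): entropy as a degradation signal,
# plus a simple diversity-preserving selection of self-generated samples.
import numpy as np
from scipy.spatial import cKDTree
from scipy.special import digamma, gammaln

def knn_entropy(x, k=3):
    """Kozachenko-Leonenko k-NN estimate of differential entropy (nats)."""
    n, d = x.shape
    dist, _ = cKDTree(x).query(x, k=k + 1)   # first neighbour is the point itself
    eps = dist[:, -1]                        # distance to the k-th true neighbour
    log_unit_ball = (d / 2) * np.log(np.pi) - gammaln(d / 2 + 1)
    return digamma(n) - digamma(k) + log_unit_ball + d * np.mean(np.log(eps + 1e-12))

def select_diverse_subset(features, budget):
    """Greedy farthest-point selection: keep samples that stay spread out,
    a simple proxy for keeping the retained set's entropy high."""
    selected = [0]
    min_dist = np.linalg.norm(features - features[0], axis=1)
    while len(selected) < budget:
        nxt = int(np.argmax(min_dist))
        selected.append(nxt)
        min_dist = np.minimum(min_dist, np.linalg.norm(features - features[nxt], axis=1))
    return np.array(selected)

# Toy usage: compare the entropy of the full synthetic pool with the subset
# kept for the next training cycle.
rng = np.random.default_rng(0)
pool = rng.normal(size=(2000, 16))       # hypothetical sample features
kept = select_diverse_subset(pool, budget=500)
print("entropy of full pool :", knn_entropy(pool))
print("entropy of kept data :", knn_entropy(pool[kept]))
```

In this sketch, a drop in the pool's estimated entropy from one cycle to the next would signal the onset of memorization, and the selection step is meant to counteract it by feeding back a more diverse subset.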
