Oral
in
Workshop: The Impact of Memorization on Trustworthy Foundation Models
Low Resource Reconstruction Attacks Through Benign Prompts
Sol Yarkoni · Roi Livni
Sat 19 Jul 8:25 a.m. PDT — 5 p.m. PDT
The rising popularity of diffusion models has raised serious concerns around privacy, copyright, and data leakage. Prior work has demonstrated that training data can be partially reconstructed, but these attacks often require significant resources, access to the training set, or carefully crafted prompts. In this work, we present a low-resource attack that reveals a more subtle risk: even seemingly innocuous prompts can lead to the unintended reconstruction of real training images. Strikingly, we show that prompts like “Abstract Art Unisex T-Shirt” can generate identifiable human faces that appear in the training data. Our findings point to a systemic vulnerability rooted in the use of scraped e-commerce data, where templated layouts tightly couple visual content with prompt patterns. This raises new concerns about how easily unintentional data leaks can occur.