

Invited Talk
in
Workshop: DIG-BUGS: Data in Generative Models (The Bad, the Ugly, and the Greats)

Building Trustworthy LLMs: How Data Quality Shapes Performance and Where It Falls Short

Nouha Dziri

Sat 19 Jul 11:15 a.m. PDT — 11:45 a.m. PDT

Abstract:

In this talk, I'll discuss how strategic data curation can contribute to building trustworthy LLMs through three case studies: Tulu3's balanced data mixing for robust safety and performance; WildTeaming's red-teaming framework, which generates large-scale crafted safety data that, when integrated with general instruction datasets, yields the best safety performance; and WildGuard's combination of synthetic and human data as a safeguard, along with its enhanced version augmented with chain-of-thought reasoning, which led to SafetyAnalyst. Beyond these case studies, I'll examine how models internalize their training data through the Creativity Index framework, showing how models' pattern-copying behaviors can enable sophisticated reasoning while simultaneously imposing fundamental constraints on model capabilities. Finally, I'll argue that while high-quality data is fundamental to AI safety and robustness, it is not enough to build smart systems: data-driven approaches alone cannot solve all trustworthiness challenges in generative models.
