Workshop
DIG-BUGS: Data in Generative Models (The Bad, the Ugly, and the Greats)
Khoa Doan · Franziska Boenisch · Adam Dziedzic · Aniruddha Saha · Viet Anh Nguyen · Zhenting Wang · Bo Li · Heather Zheng
West Ballroom A
Sat 19 Jul, 8:55 a.m. PDT
Generative models have become extremely powerful and are now integral to many aspects of daily life, from creative arts to customer service. Given their increasing interaction with people, ensuring their trustworthiness is crucial. This workshop centers on the idea that the safety and reliability of generative models are deeply connected to the nature and treatment of their training data. We aim to explore the hypothesis that building reliable and trustworthy artificial intelligence (AI) systems based on generative models must start with high-quality and responsibly managed data.

The workshop will focus on several key areas where training data impacts the trustworthiness of generative models. Among others, we will address 1) privacy concerns, highlighting how improper inclusion and handling of sensitive information in the training data can lead to significant privacy violations; 2) safety risks, such as backdoors and data poisoning, which threaten robust generation; and 3) the impact of biases in generative models' training data, which can cause models to perpetuate or even amplify societal biases, resulting in unfair outcomes.

Through expert talks, panel discussions, and interactive sessions, participants will delve into these issues and explore strategies for developing safe, trustworthy, and reliable generative models. This workshop aims to foster collaboration and drive forward research to ensure that generative models, as they become more embedded in our lives, do so in a trustworthy and beneficial manner.
Schedule
Sat 8:55 a.m. - 9:00 a.m. | Welcome Remarks (Intro) | Khoa Doan
Sat 9:00 a.m. - 9:30 a.m. | Dynamic & Stateful Evals of Safety on the Frontier: What can Academics do? (Invited Talk) | Eric Wong
Sat 9:30 a.m. - 9:45 a.m. | Preference Leakage: A Contamination Problem in LLM-as-a-judge (Oral) | Dawei Li · Renliang Sun · Yue Huang · Ming Zhong · Bohan Jiang · Jiawei Han · Xiangliang Zhang · Wei Wang · Huan Liu
Sat 9:45 a.m. - 10:00 a.m. | Why LLM Safety Guardrails Collapse After Fine-tuning: A Similarity Analysis Between Alignment and Fine-tuning Datasets (Oral) | Lei Hsiung · Tianyu Pang · Yung-Chen Tang · Linyue Song · Tsung-Yi Ho · Pin-Yu Chen · Yaoqing Yang
Sat 10:00 a.m. - 10:30 a.m. | Coffee Break
Sat 10:30 a.m. - 11:00 a.m. | There's No Free Lunch in Safety in Fine-tuning Large Language Models (Invited Talk) | Pin-Yu Chen
Sat 11:00 a.m. - 11:15 a.m. | Both Text and Images Leaked! A Systematic Analysis of Data Contamination in Multimodal LLMs (Oral) | Dingjie Song · Sicheng Lai · Mingxuan Wang · Shunian Chen · Lichao Sun · Benyou Wang
Sat 11:15 a.m. - 11:45 a.m. | Building Trustworthy LLMs: How Data Quality Shapes Performance and Where It Falls Short (Invited Talk) | Nouha Dziri
Sat 11:45 a.m. - 1:00 p.m. | Lunch Break
Sat 1:00 p.m. - 1:30 p.m. | Data-centric LM research on an academic budget (Invited Talk) | Tatsunori Hashimoto
Sat 1:30 p.m. - 1:45 p.m. | Training Diffusion Models with Noisy Data via SFBD Flow (Oral) | Haoye Lu · Darren Lo · Yaoliang Yu
Sat 1:45 p.m. - 2:15 p.m. | How (not) to hack AI? (Invited Talk) | Ivan Evtimov
Sat 2:15 p.m. - 2:30 p.m. | Unlocking Post-hoc Dataset Inference with Synthetic Data (Oral) | Bihe Zhao · Pratyush Maini · Franziska Boenisch · Adam Dziedzic
Sat 2:30 p.m. - 3:00 p.m. | Coffee Break
Sat 3:00 p.m. - 3:45 p.m. | Poster Session (Posters)
Sat 3:00 p.m. - 3:45 p.m. | Spectral Manifold Harmonization for Graph Imbalanced Regression (Poster) | Brenda Cruz Nogueira
→ Watermarking Image Autoregressive Models (Poster) | Michel Meintz · Jan Dubiński · Franziska Boenisch · Adam Dziedzic
→ A Data-Centric Safety Framework for Generative Models: Adversarial Fingerprint Detection and Attribution (Poster) | Dong Liu · Yanxuan Yu
→ Generalizing Trust: Weak-to-Strong Trustworthiness in Language Models (Poster) | Lillian Sun · Martin Pawelczyk · Zhenting Qi · Aounon Kumar · Himabindu Lakkaraju
→ COREVQA: A Crowd Observation and Reasoning Entailment Visual Question Answering Benchmark (Poster) | Charles Duong · Naaisha Agarwal · Ishant Chintapatla · Kazuma Choji · Andrew Lwin · Hannah You · Kevin Zhu · Sean O'Brien · Vasu Sharma
→ Risks of AI Scientists: Prioritizing Safeguarding Over Autonomy (Poster) | Robert Tang · Kunlun Zhu · Tongxin Yuan · Yichi Zhang · Wangchunshu Zhou · Zhuosheng Zhang
→ SOSBENCH: Benchmarking Safety Alignment on Scientific Knowledge (Poster) | Fengqing Jiang · Fengbo Ma · Zhangchen Xu · Yuetai Li · Bhaskar Ramasubramanian · Luyao Niu · Bo Li · Xianyan Chen · Zhen Xiang · Radha Poovendran
→ FaceSafe: An Inpainting Pipeline for Privacy-Compliant Scalable Image Datasets (Poster) | Sydney Su · Lening Cui · Ananya Salian · Roger You · Hao Cui · Charles Duong · Kevin Zhu · Sean O'Brien · Vasu Sharma
→ MAD-MAX: Modular And Diverse Malicious Attack MiXtures for Automated LLM Red Teaming (Poster) | Stefan Schoepf · Muhammad Zaid Hameed · Ambrish Rawat · Kieran Fraser · Giulio Zizzo · Giandomenico Cornacchia · Mark Purcell
→ R&B: Breaking the Data Mixing Bottleneck with Just 0.01% Overhead (Poster) | Albert Ge · Tzu-Heng Huang · John Cooper · Avi Trost · Ziyi Chu · Satya Sai Srinath Namburi GNVV · Jack Cai · Kendall Park · Nicholas Roberts · Frederic Sala
→ Model-based Large Language Model Customization as Service (Poster) | Zhaomin Wu · Jizhou Guo · Junyi Hou · Bingsheng He · Lixin Fan · Qiang Yang
→ Firm Foundations for Membership Inference Attacks Against Large Language Models (Poster) | Jeffrey Wang · Jason Wang · Marvin Li · Seth Neel
→ Weak-to-strong Generalization via Formative Learning from Student Demonstrations & Teacher Evaluation (Poster) | Nguyen Phuc · Chinh La · Heng Ji · Khoa Doan
→ Optimization and Robustness-Informed Membership Inference Attacks for LLMs (Poster) | Zichen Song · Qixin Zhang · Ming Li · Yao Shu
→ Lookahead Bias in Pretrained Language Models (Poster) | Suproteem Sarkar · Keyon Vafa
→ In-Context Bias Propagation in LLM-Based Tabular Data Generation (Poster) | Pol G. Recasens · Alberto Gutierrez-Torre · Jordi Torres · Josep Lluís Berral · Anisa Halimi · Kieran Fraser
→ Cascading Adversarial Bias from Injection to Distillation in Language Models (Poster) | Harsh Chaudhari · Jamie Hayes · Matthew Jagielski · Ilia Shumailov · Milad Nasr · Alina Oprea
→ Ghost in the Cloud: Your Geo-Distributed Large Language Models Training is Easily Manipulated (Poster) | Zichen Tang · Zhenheng Tang · Gaoning Pan · Buhua Liu · Kunfeng Lai · Xiaowen Chu · Bo Li
→ Improvement-Guided Iterative DPO for Diffusion Models (Poster) | Ying Fan · Fei Deng · Yang Zhao · Sahil Singla · Rahul Jain · Tingbo Hou · Kangwook Lee · Feng Yang · Deepak Ramachandran · Qifei Wang
→ Implementing Adaptations for Vision AutoRegressive Model (Poster) | Kaif Shaikh · Antoni Kowalczuk · Franziska Boenisch · Adam Dziedzic
→ RN-F: A Novel Approach for Mitigating Contaminated Data in Large Language Models (Poster) | Vu Anh Le · Dinh Nguyen · Phi Nguyen · Keshav Sood
→ DP-AdamW: Investigating Decoupled Weight Decay and Bias Correction in Private Deep Learning (Poster) | Lillian Sun · Kevin Cong · Jay Chooi · Russell Li
→ Backdooring VLMs via Concept-Driven Triggers (Poster) | Yufan Feng · Weimin Lyu · Yuxin Wang · Benjamin Tan · Yani Ioannou
→ JailbreakLoRA: Your Downloaded LoRA from Sharing Platforms might be Unsafe (Poster) | Fanjunduo Wei · Zhenheng Tang · Rongfei Zeng · Tongliang Liu · Chengqi Zhang · Xiaowen Chu · Bo Han
→ Diversity Boosts AI-Generated Text Detection (Poster) | Advik Basani · Pin-Yu Chen
→ OVERT: A Benchmark for Over-Refusal Evaluation on Text-to-Image Models (Poster) | Ziheng Cheng · Yixiao Huang · Hui Xu · Somayeh Sojoudi · Xuandong Zhao · Dawn Song · Song Mei
→ Data Cartography for Detecting Memorization Hotspots and Guiding Data Interventions in Generative Models (Poster) | Laksh Patel · Neel Shanbhag
→ Detective SAM: Adapting SAM to Localize Diffusion-based Forgeries via Embedding Artifacts (Poster) | Gert Lek · Chaoyi Zhu · Pin-Yu Chen · Robert Birke · Lydia Y. Chen
→ TruthLens: Training-Free Data Verification for Deepfake Images via VQA-style Probing (Poster) | Ritabrata Chakraborty · Rajatsubhra Chakraborty · Ali K. Rahimian
→ Layer-wise Influence Tracing: Data-Centric Mitigation of Memorization in Diffusion Models (Poster) | Thomas Chen
→ A Representation Engineering Perspective on the Effectiveness of Multi-Turn Jailbreaks (Poster) | Blake Bullwinkel · Mark Russinovich · Ahmed Salem · Santiago Zanella-Beguelin · Dan Jones · Giorgio Severi · Eugenia Kim · Keegan Hines · Amanda Minnich · Yonatan Zunger · Ram Shankar Siva Kumar
→ Optimal Defenses Against Data Reconstruction Attacks (Poster) | Yuxiao Chen · Gamze Gursoy · Qi Lei
Sat 3:45 p.m. - 4:15 p.m. | On Specification Data (Invited Talk) | Serena Booth
Sat 4:15 p.m. - 4:45 p.m. | Panel Discussion (Panel) | Eric Wong · Pin-Yu Chen · Ivan Evtimov · Nouha Dziri · Serena Booth
Sat 4:45 p.m. - 4:55 p.m. | Best Paper Awards (Awards) | Khoa Doan
Sat 4:55 p.m. - 5:00 p.m. | Concluding Remarks (Closing) | Khoa Doan