Workshop
DIG-BUGS: Data in Generative Models (The Bad, the Ugly, and the Greats)
Khoa Doan · Franziska Boenisch · Adam Dziedzic · Aniruddha Saha · Viet Anh Nguyen · Zhenting Wang · Bo Li · Heather Zheng
West Ballroom A
Sat 19 Jul, 8:55 a.m. PDT
Generative models have become extremely powerful and are now integral to many aspects of daily life, from creative arts to customer service. Given their increasing interaction with people, ensuring their trustworthiness is crucial. This workshop centers on the idea that the safety and reliability of generative models are deeply connected to the nature and treatment of their training data. We aim to explore the hypothesis that building reliable and trustworthy artificial intelligence (AI) systems based on generative models must start with high-quality and responsibly managed data.

The workshop will focus on several key areas where training data impacts the trustworthiness of generative models. Among others, we will address 1) privacy concerns, highlighting how improper inclusion and handling of sensitive information in the training data can lead to significant privacy violations; 2) safety risks, such as backdoors and data poisoning, which threaten robust generation; and 3) the impact of biases in generative models' training data, which can cause models to perpetuate or even amplify societal biases, resulting in unfair outcomes.

Through expert talks, panel discussions, and interactive sessions, participants will delve into these issues and explore strategies for developing safe, trustworthy, and reliable generative models. This workshop aims to foster collaboration and drive forward research to ensure that generative models, as they become more embedded in our lives, do so in a trustworthy and beneficial manner.
Schedule
Sat 8:55 a.m. - 9:00 a.m. | Welcome Remarks (Intro) | Khoa Doan
Sat 9:00 a.m. - 9:30 a.m. | Dynamic & Stateful Evals of Safety on the Frontier: What can Academics do? (Invited Talk) | Eric Wong
Sat 9:30 a.m. - 9:45 a.m. | Preference Leakage: A Contamination Problem in LLM-as-a-judge (Oral) | Dawei Li · Renliang Sun · Yue Huang · Ming Zhong · Bohan Jiang · Jiawei Han · Xiangliang Zhang · Wei Wang · Huan Liu
Sat 9:45 a.m. - 10:00 a.m. | Why LLM Safety Guardrails Collapse After Fine-tuning: A Similarity Analysis Between Alignment and Fine-tuning Datasets (Oral) | Lei Hsiung · Tianyu Pang · Yung-Chen Tang · Linyue Song · Tsung-Yi Ho · Pin-Yu Chen · Yaoqing Yang
Sat 10:00 a.m. - 10:30 a.m. | Coffee Break
Sat 10:30 a.m. - 11:00 a.m. | There's No Free Lunch in Safety in Fine-tuning Large Language Models (Invited Talk) | Pin-Yu Chen
Sat 11:00 a.m. - 11:15 a.m. | Both Text and Images Leaked! A Systematic Analysis of Data Contamination in Multimodal LLMs (Oral) | Dingjie Song · Sicheng Lai · Mingxuan Wang · Shunian Chen · Lichao Sun · Benyou Wang
Sat 11:15 a.m. - 11:45 a.m. | Building Trustworthy LLMs: How Data Quality Shapes Performance and Where It Falls Short (Invited Talk) | Nouha Dziri
Sat 11:45 a.m. - 1:00 p.m. | Lunch Break
Sat 1:00 p.m. - 1:30 p.m. | Data-centric LM research on an academic budget (Invited Talk) | Tatsunori Hashimoto
Sat 1:30 p.m. - 1:45 p.m. | Training Diffusion Models with Noisy Data via SFBD Flow (Oral) | Haoye Lu · Darren Lo · Yaoliang Yu
Sat 1:45 p.m. - 2:15 p.m. | How (not) to hack AI? (Invited Talk) | Ivan Evtimov
Sat 2:15 p.m. - 2:30 p.m. | Unlocking Post-hoc Dataset Inference with Synthetic Data (Oral) | Bihe Zhao · Pratyush Maini · Franziska Boenisch · Adam Dziedzic
Sat 2:30 p.m. - 3:00 p.m. | Coffee Break
Sat 3:00 p.m. - 3:45 p.m. | Poster Session (Posters)
Sat 3:00 p.m. - 3:45 p.m. | Spectral Manifold Harmonization for Graph Imbalanced Regression (Poster) | Brenda Cruz Nogueira
→ Watermarking Image Autoregressive Models (Poster) | Michel Meintz · Jan Dubiński · Franziska Boenisch · Adam Dziedzic
→ A Data-Centric Safety Framework for Generative Models: Adversarial Fingerprint Detection and Attribution (Poster) | Dong Liu · Yanxuan Yu
→ Generalizing Trust: Weak-to-Strong Trustworthiness in Language Models (Poster) | Lillian Sun · Martin Pawelczyk · Zhenting Qi · Aounon Kumar · Himabindu Lakkaraju
→ COREVQA: A Crowd Observation and Reasoning Entailment Visual Question Answering Benchmark (Poster) | Charles Duong · Naaisha Agarwal · Ishant Chintapatla · Kazuma Choji · Andrew Lwin · Hannah You · Kevin Zhu · Sean O'Brien · Vasu Sharma
→ Risks of AI Scientists: Prioritizing Safeguarding Over Autonomy (Poster) | Robert Tang · Kunlun Zhu · Tongxin Yuan · Yichi Zhang · Wangchunshu Zhou · Zhuosheng Zhang
→ SOSBENCH: Benchmarking Safety Alignment on Scientific Knowledge (Poster) | Fengqing Jiang · Fengbo Ma · Zhangchen Xu · Yuetai Li · Bhaskar Ramasubramanian · Luyao Niu · Bo Li · Xianyan Chen · Zhen Xiang · Radha Poovendran
→ FaceSafe: An Inpainting Pipeline for Privacy-Compliant Scalable Image Datasets (Poster) | Sydney Su · Lening Cui · Ananya Salian · Roger You · Hao Cui · Charles Duong · Kevin Zhu · Sean O'Brien · Vasu Sharma
→ MAD-MAX: Modular And Diverse Malicious Attack MiXtures for Automated LLM Red Teaming (Poster) | Stefan Schoepf · Muhammad Zaid Hameed · Ambrish Rawat · Kieran Fraser · Giulio Zizzo · Giandomenico Cornacchia · Mark Purcell
→ R&B: Breaking the Data Mixing Bottleneck with Just 0.01% Overhead (Poster) | Albert Ge · Tzu-Heng Huang · John Cooper · Avi Trost · Ziyi Chu · Satya Sai Srinath Namburi GNVV · Jack Cai · Kendall Park · Nicholas Roberts · Frederic Sala
→ Model-based Large Language Model Customization as Service (Poster) | Zhaomin Wu · Jizhou Guo · Junyi Hou · Bingsheng He · Lixin Fan · Qiang Yang
→ Firm Foundations for Membership Inference Attacks Against Large Language Models (Poster) | Jeffrey Wang · Jason Wang · Marvin Li · Seth Neel
→ Weak-to-strong Generalization via Formative Learning from Student Demonstrations & Teacher Evaluation (Poster) | Nguyen Phuc · Chinh La · Heng Ji · Khoa Doan
→ Optimization and Robustness-Informed Membership Inference Attacks for LLMs (Poster) | Zichen Song · Qixin Zhang · Ming Li · Yao Shu
→ Lookahead Bias in Pretrained Language Models (Poster) | Suproteem Sarkar · Keyon Vafa
→ In-Context Bias Propagation in LLM-Based Tabular Data Generation (Poster) | Pol G. Recasens · Alberto Gutierrez-Torre · Jordi Torres · Josep Lluís Berral · Anisa Halimi · Kieran Fraser
→ Cascading Adversarial Bias from Injection to Distillation in Language Models (Poster) | Harsh Chaudhari · Jamie Hayes · Matthew Jagielski · Ilia Shumailov · Milad Nasr · Alina Oprea
→ Ghost in the Cloud: Your Geo-Distributed Large Language Models Training is Easily Manipulated (Poster) | Zichen Tang · Zhenheng Tang · Gaoning Pan · Buhua Liu · Kunfeng Lai · Xiaowen Chu · Bo Li
→ Improvement-Guided Iterative DPO for Diffusion Models (Poster) | Ying Fan · Fei Deng · Yang Zhao · Sahil Singla · Rahul Jain · Tingbo Hou · Kangwook Lee · Feng Yang · Deepak Ramachandran · Qifei Wang
→ Implementing Adaptations for Vision AutoRegressive Model (Poster) | Kaif Shaikh · Antoni Kowalczuk · Franziska Boenisch · Adam Dziedzic
→ RN-F: A Novel Approach for Mitigating Contaminated Data in Large Language Models (Poster) | Vu Anh Le · Dinh Nguyen · Phi Nguyen · Keshav Sood
→ DP-AdamW: Investigating Decoupled Weight Decay and Bias Correction in Private Deep Learning (Poster) | Lillian Sun · Kevin Cong · Jay Chooi · Russell Li
→ Backdooring VLMs via Concept-Driven Triggers (Poster) | Yufan Feng · Weimin Lyu · Yuxin Wang · Benjamin Tan · Yani Ioannou
→ JailbreakLoRA: Your Downloaded LoRA from Sharing Platforms might be Unsafe (Poster) | Fanjunduo Wei · Zhenheng Tang · Rongfei Zeng · Tongliang Liu · Chengqi Zhang · Xiaowen Chu · Bo Han
→ Diversity Boosts AI-Generated Text Detection (Poster) | Advik Basani · Pin-Yu Chen
→ OVERT: A Benchmark for Over-Refusal Evaluation on Text-to-Image Models (Poster) | Ziheng Cheng · Yixiao Huang · Hui Xu · Somayeh Sojoudi · Xuandong Zhao · Dawn Song · Song Mei
→ Data Cartography for Detecting Memorization Hotspots and Guiding Data Interventions in Generative Models (Poster) | Laksh Patel · Neel Shanbhag
→ Detective SAM: Adapting SAM to Localize Diffusion-based Forgeries via Embedding Artifacts (Poster) | Gert Lek · Chaoyi Zhu · Pin-Yu Chen · Robert Birke · Lydia Y. Chen
→ TruthLens: Training-Free Data Verification for Deepfake Images via VQA-style Probing (Poster) | Ritabrata Chakraborty · Rajatsubhra Chakraborty · Ali K. Rahimian
→ Layer-wise Influence Tracing: Data-Centric Mitigation of Memorization in Diffusion Models (Poster) | Thomas Chen
→ A Representation Engineering Perspective on the Effectiveness of Multi-Turn Jailbreaks (Poster) | Blake Bullwinkel · Mark Russinovich · Ahmed Salem · Santiago Zanella-Beguelin · Dan Jones · Giorgio Severi · Eugenia Kim · Keegan Hines · Amanda Minnich · Yonatan Zunger · Ram Shankar Siva Kumar
→ Optimal Defenses Against Data Reconstruction Attacks (Poster) | Yuxiao Chen · Gamze Gursoy · Qi Lei
Sat 3:45 p.m. - 4:15 p.m. | On Specification Data (Invited Talk) | Serena Booth
Sat 4:15 p.m. - 4:45 p.m. | Panel Discussion (Panel) | Eric Wong · Pin-Yu Chen · Ivan Evtimov · Nouha Dziri · Serena Booth
Sat 4:45 p.m. - 4:55 p.m. | Best Paper Awards (Awards) | Khoa Doan
Sat 4:55 p.m. - 5:00 p.m. | Concluding Remarks (Closing) | Khoa Doan