Workshop on Technical AI Governance
Ben Bucknall · Lisa Soder · Carlos Mougan · Siddharth Swaroop · Fazl Barez · Anka Reuel · Michael A Osborne · Robert Trager
West Meeting Room 109-110
Sat 19 Jul, 9 a.m. PDT
As the development and use of AI systems expands, policymakers increasingly recognize the need for targeted actions that promote beneficial outcomes while mitigating potential harms. Yet there is often a gap between these policy goals and the technical knowledge required for effective implementation, risking ineffective or actively harmful results (Reuel et al., 2024b). Technical AI governance—a nascent field focused on providing analyses and tools to guide policy decisions and enhance policy implementation—currently lacks sufficient venues for exchanging scholarly work. This workshop aims to provide such a venue, fostering interdisciplinary dialogue between machine learning researchers and policy experts by ensuring each submission is reviewed by both technical and policy specialists. Through this collaboration, we seek to accelerate the development of robust governance strategies that lead to safer, more equitable AI systems.
Schedule
Sat 9:00 a.m. - 9:15 a.m. | Introductory Remarks | Lisa Soder · Ben Bucknall
Sat 9:15 a.m. - 9:40 a.m. | Beyond Pass/Fail: Extracting Behavioral Insights from Large-Scale AI Agent Safety Evaluations (Invited Speaker) | Cozmin Ududec
Sat 9:45 a.m. - 10:10 a.m. | EU Policy for General Purpose AI under the AI Act (Invited Speaker) | Lucilla Sioli
Sat 10:10 a.m. - 10:20 a.m. | Coffee Break
Sat 10:20 a.m. - 10:30 a.m. | Measuring What Matters: A Framework for Evaluating Safety Risks in Real-World LLM Applications (Spotlight Talk) | Jia Goh · Shaun Khoo · Gabriel Chua · Leanne Tan · Nyx Iskandar · Jessica Foo
Sat 10:30 a.m. - 10:40 a.m. | Trends in AI Supercomputers (Spotlight Talk) | Konstantin Pilz · James Sanders · Robi Rahman · Lennart Heim
Sat 10:40 a.m. - 10:50 a.m. | Deprecating Benchmarks: Criteria and Framework (Spotlight Talk) | Ayrton San Joaquin · Rokas Gipiškis · Leon Staufer · Ariel Gil
Sat 10:50 a.m. - 11:00 a.m. | LLMs Can Covertly Sandbag On Capability Evaluations Against Chain-of-Thought Monitoring (Spotlight Talk) | Chloe Li · Noah Siegel · Mary Phuong
Sat 11:15 a.m. - 12:15 p.m. | Panel Discussion
Sat 12:15 p.m. - 2:00 p.m. | Extended Lunch Break / Poster Session / Office Hours with UK AISI, EU AI Office, CAISI
Sat 2:00 p.m. - 2:25 p.m. | In-House Evaluation Is Not Enough: Towards Robust Third-Party Flaw Disclosure for General-Purpose AI (Invited Speaker) | Shayne Longpre
Sat 2:30 p.m. - 3:00 p.m. | Technical AI Governance in Practice: What Tools Miss, and Where We Go Next (Invited Speaker) | Victor Ojewale
Sat 3:00 p.m. - 4:00 p.m. | Coffee Break / Poster Session
Sat 4:00 p.m. - 4:10 p.m. | Distributed and Decentralised Training: Technical Governance Challenges in a Shifting AI Landscape (Spotlight Talk) | Jakub Kryś · Yashvardhan Sharma · Janet Egan
Sat 4:10 p.m. - 4:20 p.m. | Evaluating LLM Agent Adherence to Hierarchical Principles: A Lightweight Benchmark for Verifying AI Safety Plan Components (Spotlight Talk) | Ram Potham
Sat 4:20 p.m. - 4:30 p.m. | CALMA: Context-Aligned Axes for Language Model Alignment (Spotlight Talk) | Prajna Soni · Deepika Raman · Dylan Hadfield-Menell
Sat 4:30 p.m. - 4:40 p.m. | Hardware-Enabled Mechanisms for Verifying Responsible AI Development (Spotlight Talk) | Aidan O'Gara · Gabriel Kulp · Will Hodgkins · James Petrie · Vincent Immler · Aydin Aysu · Kanad Basu · Shivam Bhasin · Stjepan Picek · Ankur Srivastava
Sat 4:50 p.m. - 5:00 p.m. | Concluding Remarks | Ben Bucknall · Lisa Soder
From Individual Experience to Collective Evidence: A Reporting-Based Framework for Identifying Systemic Harms (Poster) | Jessica Dai · Paula Gradu · Inioluwa Raji · Benjamin Recht
Practical Principles for AI Cost and Compute Accounting (Poster) | Stephen Casper · Luke Bailey · Tim Schreier
Guaranteeable Memory: An HBM-Based Chiplet for Verifiable AI Workloads (Poster) | James Petrie
Proofs of Autonomy: Scalable and Practical Verification of AI Autonomy (Poster) | Artem Grigor · Christian Schroeder de Witt · Ivan Martinovic
Detecting Compute Structuring in AI Governance is likely feasible (Poster) | Emmanouil Seferis · Timothy Fist
Compute Requirements for Algorithmic Innovation in Frontier AI Models (Poster) | Peter Barnett
Attestable Audits: Verifiable AI Safety Benchmarks Using Trusted Execution Environments (Poster) | Chris Schnabl · Daniel Hugenroth · Bill Marino · Alastair Beresford
A Taxonomy for Design and Evaluation of Prompt-Based Natural Language Explanations (Poster) | Isar Nejadgholi · Mona Omidyeganeh · Marc-Antoine Drouin · Jonathan Boisvert
Acceleration potential in the GPU design-to-manufacturing pipeline (Poster) | Maximilian Negele
Probing Evaluation Awareness of Language Models (Poster) | Jord Nguyen · Hoang Khiem · Carlo Attubato · Felix Hofstätter
Fallacies of Data Transparency: Rethinking Nutrition Facts for AI (Poster) | Judy Hanwen Shen · Ken Ziyu Liu · Angelina Wang · Sarah Cen · Andy Zhang · Caroline Meinhardt · Daniel Zhang · Kevin Klyman · Rishi Bommasani · Daniel Ho
Exploring an Agenda on Memorization-based Copyright Verification (Poster) | Harry Jiang · Aster Plotnik · Carlee Joe-Wong
Robust ML Auditing using Prior Knowledge (Poster) | Jade Garcia Bourrée · Augustin Godinot · Martijn de Vos · Milos Vujasinovic · Sayan Biswas · Gilles Tredan · Erwan Le Merrer · Anne-Marie Kermarrec
Scaling Limits to AI Chip Production (Poster) | Maximilian Negele · Lennart Heim · Peter Ruschhaupt
The Strong, weak and benign Goodhart's law. An independence-free and paradigm-agnostic formalisation (Poster) | Adrien Majka · El-Mahdi El-Mhamdi
Access Controls Will Solve the Dual-Use Dilemma (Poster) | Evžen Wybitul
Locking Open Weight Models with Spectral Deformation (Poster) | Domenic Rosati · Sebastian Dionicio · Xijie Zeng · Subhabrata Majumdar · Frank Rudzicz · Hassan Sajjad
A Blueprint for a Secure EU AI Audit Ecosystem (Poster) | Alejandro Tlaie Boria
Watermarking Without Standards Is Not AI Governance (Poster) | Alexander Nemecek · Yuzhou Jiang · Erman Ayday
Methodological Challenges in Agentic Evaluations of AI Systems (Poster) | Kevin Wei · Stephen Guth · Gabriel Wu · Patricia Paskov
Trends in Frontier AI Model Count: A Forecast to 2028 (Poster) | Iyngkarran Kumar · Sam Manning
Meek Models Shall Inherit The Earth (Poster) | Hans Gundlach · Jayson Lynch · Neil Thompson
AI Benchmarks: Interdisciplinary Issues and Policy Considerations (Poster) | Maria Eriksson · Erasmo Purificato · Arman Noroozian · João Vinagre · Guillaume Chaslot · Emilia Gomez · David Fernández-Llorca
Position: Generative AI Regulation Can Learn From Social Media Regulation (Poster) | Ruth Elisabeth Appel
Reproducibility: The New Frontier in AI Governance (Poster) | Israel Mason-Williams · Gabryel Mason-Williams
ExpProof: Operationalizing Explanations for Confidential Models with ZKPs (Poster) | Chhavi Yadav · Evan Laufer · Dan Boneh · Kamalika Chaudhuri
Exploring Functional Similarities of Backdoored Models (Poster) | Yufan Feng · Benjamin Tan · Yani Ioannou
LibVulnWatch: A Deep Assessment Agent System and Leaderboard for Uncovering Hidden Vulnerabilities in Open-Source AI Libraries (Poster) | Zekun Wu · Seonglae Cho · Umar Mohammed · Cristian Villalobos · Kleyton Da Costa · Xin Guan · Theo King · Ze Wang · Emre Kazim · Adriano Koshiyama
Technical Requirements for Halting Dangerous AI Activities (Poster) | Peter Barnett · Aaron Scher · David Abecassis
A Conceptual Framework for AI Capability Evaluations (Poster) | María Carro · Denise Mester · Francisca Selasco · Luca Gangi · Matheo Musa · Lola Pereyra · Mario Leiva · Juan Corvalan · Maria Vanina Martinez · Gerardo Simari
Relative Bias: A Comparative Approach for Quantifying Bias in LLMs (Poster) | Alireza Arbabi · Florian Kerschbaum
Distinguishing Pre-AI and Post-AI Baselines in Marginal Risk Reporting (Poster) | Jide Alaga · Michael Chen
In-House Evaluation Is Not Enough: Towards Robust Third-Party Flaw Disclosure for General-Purpose AI (Poster) | Shayne Longpre · Kevin Klyman · Ruth Elisabeth Appel · Sayash Kapoor · Rishi Bommasani · Michelle Sahar · Sean McGregor · Avijit Ghosh · Borhane Blili-Hamelin · Nathan Butters · Alondra Nelson · Amit Elazari · Andrew Sellars · Casey Ellis · Dane Sherrets · Dawn Song · Harley Geiger · Ilona Cohen · Lauren McIlvenny · Madhulika Srikumar · Mark Jaycox · Markus Anderljung · Nadine Johnson · Nicholas Carlini · Nicolas Miailhe · Nik Marda · Peter Henderson · Rebecca Portnoff · Rebecca Weiss · Victoria Westerhoff · Yacine Jernite · Rumman Chowdhury · Percy Liang · Arvind Narayanan
Societal Capacity Assessment Framework: Measuring Vulnerability, Resilience, and Transformation from Advanced AI (Poster) | Milan Gandhi · Peter Cihon · Owen Larter
Position: Formal Methods are the Principled Foundation of Safe AI (Poster) | Gagandeep Singh · Deepika Chawla
Fragile by Design: Formalizing Watermarking Tradeoffs via Paraphrasing (Poster) | Ali Falahati · Lukasz Golab
Expert Survey: AI Safety & Security Research Priorities (Poster) | Joe O'Brien · Jeremy Dolan · Jeba Sania · Jay Kim · Rocio Labrador · Jonah Dykhuizen · Sebastian Becker · Jam Kraprayoon