Workshop on Technical AI Governance
Ben Bucknall · Lisa Soder · Carlos Mougan · Siddharth Swaroop · Fazl Barez · Anka Reuel · Michael A Osborne · Robert Trager
West Meeting Room 109-110
Sat 19 Jul, 9 a.m. PDT
As the development and use of AI systems expands, policymakers increasingly recognize the need for targeted actions that promote beneficial outcomes while mitigating potential harms. Yet there is often a gap between these policy goals and the technical knowledge required for effective implementation, risking ineffective or actively harmful results (Reuel et al., 2024b). Technical AI governance—a nascent field focused on providing analyses and tools to guide policy decisions and enhance policy implementation—currently lacks sufficient venues for exchanging scholarly work. This workshop aims to provide such a venue, fostering interdisciplinary dialogue between machine learning researchers and policy experts by ensuring each submission is reviewed by both technical and policy specialists. Through this collaboration, we seek to accelerate the development of robust governance strategies that lead to safer, more equitable AI systems.
Schedule
Sat 9:00 a.m. - 9:15 a.m. | Introductory Remarks | Lisa Soder · Ben Bucknall
Sat 9:15 a.m. - 9:40 a.m. | Beyond Pass/Fail: Extracting Behavioral Insights from Large-Scale AI Agent Safety Evaluations (Invited Speaker) | Cozmin Ududec
Sat 9:45 a.m. - 10:10 a.m. | EU Policy for General Purpose AI under the AI Act (Invited Speaker) | Lucilla Sioli
Sat 10:10 a.m. - 10:20 a.m. | Coffee Break
Sat 10:20 a.m. - 10:30 a.m. | Measuring What Matters: A Framework for Evaluating Safety Risks in Real-World LLM Applications (Spotlight Talk) | Jia Goh · Shaun Khoo · Gabriel Chua · Leanne Tan · Nyx Iskandar · Jessica Foo
Sat 10:30 a.m. - 10:40 a.m. | Trends in AI Supercomputers (Spotlight Talk) | Konstantin Pilz · James Sanders · Robi Rahman · Lennart Heim
Sat 10:40 a.m. - 10:50 a.m. | Deprecating Benchmarks: Criteria and Framework (Spotlight Talk) | Ayrton San Joaquin · Rokas Gipiškis · Leon Staufer · Ariel Gil
Sat 10:50 a.m. - 11:00 a.m. | LLMs Can Covertly Sandbag On Capability Evaluations Against Chain-of-Thought Monitoring (Spotlight Talk) | Chloe Li · Noah Siegel · Mary Phuong
Sat 11:15 a.m. - 12:15 p.m. | Panel Discussion
Sat 12:15 p.m. - 2:00 p.m. | Extended Lunch Break / Poster Session / Office Hours with UK AISI, EU AI Office, CAISI
Sat 2:00 p.m. - 2:25 p.m. | In-House Evaluation Is Not Enough: Towards Robust Third-Party Flaw Disclosure for General-Purpose AI (Invited Speaker) | Shayne Longpre
Sat 2:30 p.m. - 3:00 p.m. | Technical AI Governance in Practice: What Tools Miss, and Where We Go Next (Invited Speaker) | Victor Ojewale
Sat 3:00 p.m. - 4:00 p.m. | Coffee Break / Poster Session
Sat 4:00 p.m. - 4:10 p.m. | Distributed and Decentralised Training: Technical Governance Challenges in a Shifting AI Landscape (Spotlight Talk) | Jakub Kryś · Yashvardhan Sharma · Janet Egan
Sat 4:10 p.m. - 4:20 p.m. | Evaluating LLM Agent Adherence to Hierarchical Principles: A Lightweight Benchmark for Verifying AI Safety Plan Components (Spotlight Talk) | Ram Potham
Sat 4:20 p.m. - 4:30 p.m. | CALMA: Context-Aligned Axes for Language Model Alignment (Spotlight Talk) | Prajna Soni · Deepika Raman · Dylan Hadfield-Menell
Sat 4:30 p.m. - 4:40 p.m. | Hardware-Enabled Mechanisms for Verifying Responsible AI Development (Spotlight Talk) | Aidan O'Gara · Gabriel Kulp · Will Hodgkins · James Petrie · Vincent Immler · Aydin Aysu · Kanad Basu · Shivam Bhasin · Stjepan Picek · Ankur Srivastava
Sat 4:50 p.m. - 5:00 p.m. | Concluding Remarks | Ben Bucknall · Lisa Soder
From Individual Experience to Collective Evidence: A Reporting-Based Framework for Identifying Systemic Harms (Poster) | Jessica Dai · Paula Gradu · Inioluwa Raji · Benjamin Recht
Practical Principles for AI Cost and Compute Accounting (Poster) | Stephen Casper · Luke Bailey · Tim Schreier
Guaranteeable Memory: An HBM-Based Chiplet for Verifiable AI Workloads (Poster) | James Petrie
Proofs of Autonomy: Scalable and Practical Verification of AI Autonomy (Poster) | Artem Grigor · Christian Schroeder de Witt · Ivan Martinovic
Detecting Compute Structuring in AI Governance is likely feasible (Poster) | Emmanouil Seferis · Timothy Fist
Compute Requirements for Algorithmic Innovation in Frontier AI Models (Poster) | Peter Barnett
Attestable Audits: Verifiable AI Safety Benchmarks Using Trusted Execution Environments (Poster) | Chris Schnabl · Daniel Hugenroth · Bill Marino · Alastair Beresford
A Taxonomy for Design and Evaluation of Prompt-Based Natural Language Explanations (Poster) | Isar Nejadgholi · Mona Omidyeganeh · Marc-Antoine Drouin · Jonathan Boisvert
Acceleration potential in the GPU design-to-manufacturing pipeline (Poster) | Maximilian Negele
Probing Evaluation Awareness of Language Models (Poster) | Jord Nguyen · Hoang Khiem · Carlo Attubato · Felix Hofstätter
Fallacies of Data Transparency: Rethinking Nutrition Facts for AI (Poster) | Judy Hanwen Shen · Ken Ziyu Liu · Angelina Wang · Sarah Cen · Andy Zhang · Caroline Meinhardt · Daniel Zhang · Kevin Klyman · Rishi Bommasani · Daniel Ho
Exploring an Agenda on Memorization-based Copyright Verification (Poster) | Harry Jiang · Aster Plotnik · Carlee Joe-Wong
Robust ML Auditing using Prior Knowledge (Poster) | Jade Garcia Bourrée · Augustin Godinot · Martijn de Vos · Milos Vujasinovic · Sayan Biswas · Gilles Tredan · Erwan Le Merrer · Anne-Marie Kermarrec
Scaling Limits to AI Chip Production (Poster) | Maximilian Negele · Lennart Heim · Peter Ruschhaupt
The Strong, weak and benign Goodhart's law. An independence-free and paradigm-agnostic formalisation (Poster) | Adrien Majka · El-Mahdi El-Mhamdi
Access Controls Will Solve the Dual-Use Dilemma (Poster) | Evžen Wybitul
Locking Open Weight Models with Spectral Deformation (Poster) | Domenic Rosati · Sebastian Dionicio · Xijie Zeng · Subhabrata Majumdar · Frank Rudzicz · Hassan Sajjad
A Blueprint for a Secure EU AI Audit Ecosystem (Poster) | Alejandro Tlaie Boria
Watermarking Without Standards Is Not AI Governance (Poster) | Alexander Nemecek · Yuzhou Jiang · Erman Ayday
Methodological Challenges in Agentic Evaluations of AI Systems (Poster) | Kevin Wei · Stephen Guth · Gabriel Wu · Patricia Paskov
Trends in Frontier AI Model Count: A Forecast to 2028 (Poster) | Iyngkarran Kumar · Sam Manning
Meek Models Shall Inherit The Earth (Poster) | Hans Gundlach · Jayson Lynch · Neil Thompson
AI Benchmarks: Interdisciplinary Issues and Policy Considerations (Poster) | Maria Eriksson · Erasmo Purificato · Arman Noroozian · João Vinagre · Guillaume Chaslot · Emilia Gomez · David Fernández-Llorca
Position: Generative AI Regulation Can Learn From Social Media Regulation (Poster) | Ruth Elisabeth Appel
Reproducibility: The New Frontier in AI Governance (Poster) | Israel Mason-Williams · Gabryel Mason-Williams
ExpProof: Operationalizing Explanations for Confidential Models with ZKPs (Poster) | Chhavi Yadav · Evan Laufer · Dan Boneh · Kamalika Chaudhuri
Exploring Functional Similarities of Backdoored Models (Poster) | Yufan Feng · Benjamin Tan · Yani Ioannou
LibVulnWatch: A Deep Assessment Agent System and Leaderboard for Uncovering Hidden Vulnerabilities in Open-Source AI Libraries (Poster) | Zekun Wu · Seonglae Cho · Umar Mohammed · Cristian Villalobos · Kleyton Da Costa · Xin Guan · Theo King · Ze Wang · Emre Kazim · Adriano Koshiyama
Technical Requirements for Halting Dangerous AI Activities (Poster) | Peter Barnett · Aaron Scher · David Abecassis
A Conceptual Framework for AI Capability Evaluations (Poster) | María Carro · Denise Mester · Francisca Selasco · Luca Gangi · Matheo Musa · Lola Pereyra · Mario Leiva · Juan Corvalan · Maria Vanina Martinez · Gerardo Simari
Relative Bias: A Comparative Approach for Quantifying Bias in LLMs (Poster) | Alireza Arbabi · Florian Kerschbaum
Distinguishing Pre-AI and Post-AI Baselines in Marginal Risk Reporting (Poster) | Jide Alaga · Michael Chen
In-House Evaluation Is Not Enough: Towards Robust Third-Party Flaw Disclosure for General-Purpose AI (Poster) | Shayne Longpre · Kevin Klyman · Ruth Elisabeth Appel · Sayash Kapoor · Rishi Bommasani · Michelle Sahar · Sean McGregor · Avijit Ghosh · Borhane Blili-Hamelin · Nathan Butters · Alondra Nelson · Amit Elazari · Andrew Sellars · Casey Ellis · Dane Sherrets · Dawn Song · Harley Geiger · Ilona Cohen · Lauren McIlvenny · Madhulika Srikumar · Mark Jaycox · Markus Anderljung · Nadine Johnson · Nicholas Carlini · Nicolas Miailhe · Nik Marda · Peter Henderson · Rebecca Portnoff · Rebecca Weiss · Victoria Westerhoff · Yacine Jernite · Rumman Chowdhury · Percy Liang · Arvind Narayanan
Societal Capacity Assessment Framework: Measuring Vulnerability, Resilience, and Transformation from Advanced AI (Poster) | Milan Gandhi · Peter Cihon · Owen Larter
Position: Formal Methods are the Principled Foundation of Safe AI (Poster) | Gagandeep Singh · Deepika Chawla
Fragile by Design: Formalizing Watermarking Tradeoffs via Paraphrasing (Poster) | Ali Falahati · Lukasz Golab
Expert Survey: AI Safety & Security Research Priorities (Poster) | Joe O'Brien · Jeremy Dolan · Jeba Sania · Jay Kim · Rocio Labrador · Jonah Dykhuizen · Sebastian Becker · Jam Kraprayoon