

Poster in Workshop: Workshop on Technical AI Governance

Distinguishing Pre-AI and Post-AI Baselines in Marginal Risk Reporting

Jide Alaga · Michael Chen


Abstract:

Major developers of large language models make development and deployment decisions informed by evaluations of "marginal risk": the risk introduced by a new AI model relative to a baseline. Developers face a critical choice between two types of baselines: a "pre-AI" baseline without modern AI systems (e.g., only having 2023-level technology), or a "post-AI" baseline that includes the most risk-enabling models already available. Reviewing frontier safety policies adopted by AI developers, we find that they are inconsistent in specifying which baseline is being used. We examine the potential risks of shifting toward post-AI baselines, which can create an environment in which each individual model appears incrementally safe while the cumulative risk from AI development grows to unacceptable levels. We recommend that AI developers clarify whether a pre-AI or post-AI baseline is used, increase transparency around risk factors, and prepare to reduce aggregate risk before it becomes unacceptably high.
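To make the abstract's central concern concrete, here is a minimal Python sketch (not from the paper; the risk scores, acceptance threshold, and ratcheting baseline are all invented for illustration) of how a post-AI baseline can make every successive model look incrementally safe while the risk relative to a pre-AI world keeps growing:

```python
# Toy illustration: per-release marginal risk under a post-AI baseline
# vs. cumulative risk measured against a fixed pre-AI baseline.
# All numbers are hypothetical risk scores on an arbitrary scale.

PRE_AI_BASELINE = 10.0   # hypothetical risk enabled by 2023-level technology
THRESHOLD = 5.0          # hypothetical "acceptable marginal risk" per release

# Hypothetical risk scores of successive frontier models.
model_risks = [14.0, 18.0, 21.5, 25.0, 28.0]

best_available = PRE_AI_BASELINE  # most risk-enabling system already deployed

for i, risk in enumerate(model_risks, start=1):
    marginal_post_ai = risk - best_available   # vs. best existing model
    marginal_pre_ai = risk - PRE_AI_BASELINE   # vs. a world without modern AI
    verdict = "pass" if marginal_post_ai <= THRESHOLD else "fail"
    print(f"model {i}: post-AI marginal = {marginal_post_ai:+.1f} ({verdict}), "
          f"pre-AI marginal = {marginal_pre_ai:+.1f}")
    best_available = max(best_available, risk)  # baseline ratchets upward
```

Running this, every release passes the per-model check (post-AI marginal risk stays at or below 5.0) even though the risk relative to the fixed pre-AI baseline climbs from +4.0 to +18.0, which is the aggregation failure the abstract warns about.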
