Spotlight Talk in Workshop: Workshop on Technical AI Governance

CALMA: Context‑Aligned Axes for Language Model Alignment

Prajna Soni · Deepika Raman · Dylan Hadfield-Menell

Sat 19 Jul 4:20 p.m. PDT — 4:30 p.m. PDT

Abstract:

Datasets play a central role in AI governance by enabling both evaluation (measuring capabilities) and alignment (enforcing values) along axes such as helpfulness, harmlessness, toxicity, and quality. However, most alignment and evaluation datasets depend on researcher‑ or developer‑defined axes curated from non‑representative samples, benchmarking models against broad (often Western‑centric) values that overlook the varied contexts of their real‑world deployment. Consequently, models trained on such proxies can fail to meet the specific needs and expectations of diverse user communities within these contexts. To bridge this gap, we introduce CALMA (Context‑Aligned Axes for Language Model Alignment), a grounded participatory methodology for eliciting context‑relevant axes for evaluation and alignment. In a pilot with two distinct communities, CALMA surfaced novel priorities that are absent from standard benchmarks. Our findings underscore the importance of grounding evaluation practices in open‑ended, use‑case‑driven processes to ensure that language models are evaluated and aligned to criteria that truly reflect the environments in which they operate.
