Invited Talk, Workshop: Actionable Interpretability
Eric Wong - Explanations for Experts via Guarantees and Domain Knowledge: From Attributions to Reasoning
"Build it and they will come." After years of research on interpreting ML models, why have domain experts largely stayed away? A major obstacle is one of translation: experts don't understand what to do with ML explanations, as the exact interpretation is often unclear and fails to align with how an expert thinks. This talk introduces two lines of research to make explanations accessible to experts. First, we introduce explanations with certified guarantees for mathematically precise and unambiguous interpretations. Second, we develop benchmarks to quantify the alignment of these explanations with expert knowledge, creating a way to evaluate if they make sense in an expert's domain language. We demonstrate our techniques across applications in healthcare, astrophysics, and psychology, for explanations ranging from classic feature attributions to LLM chain-of-thought reasoning.