Spotlight Poster

Position: We Can’t Understand AI Using our Existing Vocabulary

John Hewitt · Robert Geirhos · Been Kim

East Exhibition Hall A-B #E-500
Wed 16 Jul 4:30 p.m. PDT — 7 p.m. PDT

Abstract:

This position paper argues that, in order to understand AI, we cannot rely on our existing vocabulary of human words. Instead, we should strive to develop neologisms: new words that represent precise human concepts that we want to teach machines, or machine concepts that we need to learn. We start from the premise that humans and machines have differing concepts. This means interpretability can be framed as a communication problem: humans must be able to reference and control machine concepts, and communicate human concepts to machines. Creating a shared human-machine language through developing neologisms, we believe, could solve this communication problem. Successful neologisms achieve a useful amount of abstraction: not too detailed, so they’re reusable in many contexts, and not too high-level, so they convey precise information. As a proof of concept, we demonstrate how a “length neologism” enables controlling LLM response length, while a “diversity neologism” allows sampling more variable responses. Taken together, we argue that we cannot understand AI using our existing vocabulary, and expanding it through neologisms creates opportunities for both controlling and understanding machines better.

Lay Summary:

Understanding AI systems is a critical problem as they are increasingly deployed in the world. We frame the problem of understanding and controlling AI systems as a communication problem, similar to how humans try to communicate complex concepts to one another. In this opinion piece, we argue that achieving this communication requires new words for the ways in which AI systems see the world, and new words that teach AI systems how we see the world. Just as humans invent new words to discuss new or complex things, we need to do the same with AI systems. We provide arguments for this position and early experiments showcasing how we might accomplish this in the future.
