

Poster in Workshop: Assessing World Models: Methods and Metrics for Evaluating Understanding

Probing the Limits of Mathematical World Models in LLMs

Henry Kvinge · Elizabeth Coda · Eric Yeats · Davis Brown · John Buckheit · Sarah Scullen · Brendan Kennedy · Loc Truong · William Kay · Cliff Joslyn · Tegan Emerson · Michael Henry · John Emanuello

Keywords: [ Mathematical world models ] [ Linear probing ] [ Representation geometry and topology ]


Abstract: There are now many studies supporting the idea that even when they are trained on a broad corpus of textual data scraped from the internet, large language models (LLMs) are (sporadically) capable of non-trivial mathematical tasks. This observation, together with a collection of studies from the interpretability community, suggests that LLMs extract surprisingly rich internal representations of mathematical objects. In this paper we ask to what extent LLMs contain mathematical 'world models' that align with the way mathematicians understand and think about mathematics. We focus on simple binary operations $\star: X \times X \rightarrow X$ like addition and multiplication, which take two inputs $a$ and $b$ from a space $X$ and produce a third element $a \star b = c$. Instead of assessing the correctness of the LLM's responses, we explore the extent to which the model captures the geometric structure of $X$, simple number-theoretic properties of $a$ and $b$, and the algebraic properties of $\star$. We report mixed results. While the LLMs we tested tended to store substantial amounts of information (such as the divisibility properties of integers $a$ and $b$ in the expression $a \times b$) and sometimes extracted representations that aligned with existing mathematical structures (reconstructing a patch of $\mathbb{R}^2$, for example), these representations tended to be local in nature and to lack robustness.
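The linear-probing setup the abstract alludes to can be illustrated with a minimal sketch: train a linear classifier on hidden-state vectors to predict a number-theoretic property of an operand, e.g. whether $a$ in "$a \times b$" is divisible by 3. This is not the authors' code; the paper's experiments use actual LLM hidden states, whereas the stand-in vectors below are synthetic, with the label linearly encoded purely so the probe has something to find.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n, d = 2000, 64  # hypothetical: 2000 prompts, 64-dim hidden states

# Operands and the number-theoretic property to probe for (divisibility by 3).
a = rng.integers(1, 1000, size=n)
labels = (a % 3 == 0).astype(int)

# Stand-in "hidden states": Gaussian noise plus a fixed direction that
# carries the label, mimicking a linearly decodable feature. In a real
# experiment, H would be hidden activations extracted from an LLM.
direction = rng.normal(size=d)
H = rng.normal(size=(n, d)) + np.outer(labels, direction)

# Fit a linear probe on a train split and evaluate on a held-out split.
H_tr, H_te, y_tr, y_te = train_test_split(
    H, labels, test_size=0.25, random_state=0
)
probe = LogisticRegression(max_iter=1000).fit(H_tr, y_tr)
acc = probe.score(H_te, y_te)
print(f"probe accuracy: {acc:.2f}")
```

High held-out accuracy of such a probe is the usual evidence that a property is linearly represented; near-chance accuracy suggests it is absent or encoded nonlinearly.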
