Poster
Ehrenfeucht-Haussler Rank and Chain of Thought
Pablo Barcelo · Alexander Kozachinskiy · Tomasz Steifer
West Exhibition Hall B2-B3 #W-903
The ability of Transformers to perform function composition has garnered increasing attention in recent years, as understanding this capability sheds light on the computational resources they need to infer implicit knowledge from a given set of facts. Peng et al. demonstrated that single-layer, soft-attention Transformers without Chain-of-Thought (CoT) reasoning are fundamentally incapable of function composition. When CoT is introduced, however, they can achieve iterated composition, albeit at the cost of a number of steps that grows with both vector dimensionality and feature precision. Our work precisely quantifies the number of steps needed for the t-th iterated composition and establishes that, under the idealized assumption of hard attention, the number of required CoT steps is exactly t. This finding underscores a key insight: while CoT enables function composition, it does so incrementally, one step at a time.
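To make the t-th iterated composition task concrete, here is a minimal Python sketch (an illustration, not code from the paper): a function f on a finite domain is given as a lookup table of facts, and the goal is to compute f applied t times to a starting element. The loop resolves one application of f per iteration, mirroring the result that, with hard attention, each CoT step can resolve exactly one composition; the names and example values below are hypothetical.

```python
# Illustrative sketch (not from the paper): the t-th iterated composition task.
# A function f on a finite domain is presented as a lookup table of "facts",
# and the task is to compute f applied t times to a starting element.

def iterate(f: dict, x, t: int):
    """Return f^(t)(x), resolving one application of f per loop iteration.

    Each iteration plays the role of a single Chain-of-Thought step:
    it turns the current intermediate value into the next one.
    """
    for _ in range(t):
        x = f[x]  # one composition step resolved
    return x

# Hypothetical example: f encodes "the parent of" on a small domain.
parent = {"Alice": "Beth", "Beth": "Carol", "Carol": "Dana"}

print(iterate(parent, "Alice", 1))  # Beth  -- f(x)
print(iterate(parent, "Alice", 2))  # Carol -- f(f(x)), the grandparent
print(iterate(parent, "Alice", 3))  # Dana  -- f^(3)(x)
```

Answering the query for t = 2 requires chaining two facts, and in general t facts; the paper's result says a single-layer hard-attention Transformer with CoT needs exactly t intermediate steps to do so, no fewer and no more.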