
Poster in Workshop: Methods and Opportunities at Small Scale (MOSS)

Reasoning by Superposition: A Theoretical Perspective on Chain of Continuous Thought

Hanlin Zhu · Shibo Hao · Zhiting Hu · Jiantao Jiao · Stuart Russell · Yuandong Tian

Keywords: [ transformer ] [ reasoning ] [ chain of continuous thought ] [ superposition ]


Abstract: In this paper, we prove that a two-layer transformer with $D$ steps of continuous chain-of-thoughts (CoTs) can solve the directed graph reachability problem, where $D$ is the diameter of the graph, while the best known result for constant-depth transformers with discrete CoTs requires $O(n^2)$ decoding steps, where $n$ is the number of vertices ($D
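
For intuition about why a number of steps on the order of the diameter can suffice for reachability, the following is a minimal sketch of set-valued frontier expansion, where each step advances every reached vertex at once. It is an illustration only, not the paper's transformer construction: the function name `reachable_within`, the edge-list input format, and the example graph are all assumptions made here, and "diameter" is read as the longest shortest path between reachable vertex pairs.

```python
def reachable_within(edges, source, target, max_steps):
    """Return True if `target` can be reached from `source` in at most
    `max_steps` hops along the directed `edges` (hypothetical helper)."""
    # Build an adjacency map from the directed edge list.
    successors = {}
    for u, v in edges:
        successors.setdefault(u, set()).add(v)

    # The frontier holds every vertex reached so far, all at once,
    # instead of committing to a single path.
    frontier = {source}
    for _ in range(max_steps):
        if target in frontier:
            return True
        expanded = set(frontier)
        for u in frontier:
            expanded |= successors.get(u, set())
        if expanded == frontier:
            break  # no new vertices: further steps cannot help
        frontier = expanded
    return target in frontier


# Example graph (assumed for illustration): 0 -> 1 -> 2 -> 3 and 4 -> 3.
edges = [(0, 1), (1, 2), (2, 3), (4, 3)]
print(reachable_within(edges, 0, 3, max_steps=3))
print(reachable_within(edges, 0, 4, max_steps=3))
```

In this sketch the number of loop iterations needed is bounded by the longest shortest path from the source, which is the sense in which the step count scales with $D$ rather than with the number of vertices.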
