Poster in Workshop: Actionable Interpretability
Probabilistic Soundness Guarantees in LLM Reasoning Chains
Weiqiu You · Anton Xue · Shreya Havaldar · Delip Rao · Helen Jin · Chris Callison-Burch · Eric Wong
Large Language Models (LLMs) often generate errors in their reasoning chains, and these errors can propagate, making it difficult to verify the correctness of intermediate claims. Current LLM-based error detection methods typically take the full reasoning chain as context and output a score for each step. However, the judging model can be misled by incorrect steps in that context, and those errors carry over into the assessments of later steps. To address this problem, we draw on how humans typically check the soundness of claims in a reasoning chain and introduce Reasoning Entailment Stability (RES), a novel probabilistic framework that inductively judges each step based solely on previously validated claims. RES achieves 72.1% F1 (+8.2 points) across four benchmarks and 90.3% F1 (+27.6 points) on our controllable dataset of long reasoning chains.
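To make the inductive idea concrete, the sketch below illustrates the general pattern of judging each step against only the previously validated claims, so that an unsound step is never added to the context used for later steps. This is an illustrative sketch, not the authors' implementation: the `judge_soundness` stub and `validate_chain` helper are hypothetical names, and in practice the judge would be an LLM-based entailment scorer rather than the placeholder shown here.

```python
# Illustrative sketch of inductive step-wise validation (not the RES implementation).
# Each step is judged using only the problem statement and the steps already
# deemed sound, so errors in rejected steps cannot mislead later judgments.

from typing import Callable, List, Tuple


def judge_soundness(premises: List[str], claim: str) -> float:
    """Stub judge: probability that `claim` follows from `premises`.
    A real pipeline would replace this with an LLM-based entailment scorer."""
    return 0.9  # placeholder score for demonstration only


def validate_chain(
    problem: str,
    steps: List[str],
    judge: Callable[[List[str], str], float] = judge_soundness,
    threshold: float = 0.5,
) -> List[Tuple[str, float, bool]]:
    """Inductively judge each step against only the validated premises so far."""
    validated: List[str] = [problem]          # sound premises accumulated so far
    results: List[Tuple[str, float, bool]] = []
    for step in steps:
        score = judge(validated, step)        # context excludes rejected steps
        is_sound = score >= threshold
        results.append((step, score, is_sound))
        if is_sound:
            validated.append(step)            # only sound steps become premises
    return results


if __name__ == "__main__":
    chain = [
        "Each box holds 12 apples.",
        "Three boxes therefore hold 36 apples.",
    ]
    for step, score, ok in validate_chain("There are 3 boxes of apples.", chain):
        print(f"{'SOUND' if ok else 'UNSOUND'} ({score:.2f}): {step}")
```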