Poster in Workshop: 2nd AI for Math Workshop @ ICML 2025
A Survey on Large Language Model Reasoning Failures
Peiyang Song · Pengrui Han · Noah Goodman
Reasoning capabilities in Large Language Models (LLMs) have advanced dramatically, enabling impressive performance across diverse tasks. However, alongside these successes, notable reasoning failures frequently arise, even in seemingly straightforward scenarios. To systematically understand and address these issues, we present a comprehensive survey of reasoning failures in LLMs. We propose a clear categorization framework that divides reasoning failures into embodied and non-embodied types, with non-embodied failures further subdivided into informal (intuitive) and formal (logical) reasoning. For each category, we synthesize and discuss existing studies, identify common failure patterns, and highlight insights that point toward mitigation strategies. Our structured perspective unifies fragmented research efforts, provides deeper insight into the systemic weaknesses of current LLMs, and aims to motivate future work toward more robust, reliable, and human-aligned reasoning capabilities.
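To make the proposed taxonomy concrete, the following is a minimal Python sketch of the category structure as described in the abstract (embodied vs. non-embodied, with non-embodied split into informal and formal reasoning). All class and field names here are illustrative assumptions, not identifiers from the paper.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class FailureCategory(Enum):
    """Top-level split in the survey's taxonomy."""
    EMBODIED = "embodied"
    NON_EMBODIED = "non-embodied"


class NonEmbodiedSubtype(Enum):
    """Subdivision of non-embodied reasoning failures."""
    INFORMAL = "informal (intuitive)"
    FORMAL = "formal (logical)"


@dataclass
class ReasoningFailure:
    """An observed LLM reasoning failure, tagged by its place in the taxonomy."""
    description: str
    category: FailureCategory
    subtype: Optional[NonEmbodiedSubtype] = None  # only meaningful for non-embodied failures


# Example: a logical-inference error would fall under non-embodied / formal.
example = ReasoningFailure(
    description="Model affirms the consequent in a syllogism",
    category=FailureCategory.NON_EMBODIED,
    subtype=NonEmbodiedSubtype.FORMAL,
)
print(example.category.value, "/", example.subtype.value)
```

This two-level structure mirrors the framework in the abstract; the survey itself discusses each category's failure patterns in far more depth than this sketch captures.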