

Poster in Workshop: The Impact of Memorization on Trustworthy Foundation Models

Mirage of Mastery: Memorization Tricks LLMs into Artificially Inflated Self-Knowledge

Sahil Kale

Sat 19 Jul 8:30 a.m. PDT — 9:30 a.m. PDT

Abstract:

When artificial intelligence mistakes memorization for intelligence, it creates a dangerous illusion of reasoning. Existing studies treat memorization and self-knowledge deficits in LLMs as separate issues. In our study, we pinpoint an intertwined causal link between the two that undermines the trustworthiness of LLM responses. To investigate this, we use a novel framework to determine whether LLMs genuinely learn reasoning patterns from training data or merely memorize them and, as a result, assume competence across problems of similar complexity, with a focus on STEM domains. Our analysis reveals a noteworthy generalization problem: LLMs draw confidence from memorized solutions to infer inflated self-knowledge of their own reasoning ability, which manifests as an over 45% inconsistency in feasibility assessments when they face self-validated, logically coherent task perturbations. The effect is most pronounced in the science and medicine domains. Our code and results are publicly available at https://anonymous.4open.science/r/LLM-MemorizationSKEval--543D/.
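To make the measurement concrete, the following is a minimal sketch of the protocol the abstract describes: ask the model for a feasibility verdict on a task, ask again on a logically coherent perturbation of that task, and count verdict flips as inconsistencies. This is an illustrative reading, not the authors' released code; the names `query_llm`, `perturb`, and `feasibility_verdict` are hypothetical, and any chat-completion client could stand in for `query_llm`.

```python
# Hypothetical sketch of the inconsistency measurement described in the
# abstract. A model is asked for a yes/no feasibility judgment on each
# task and on a perturbed variant; a flipped verdict counts as one
# inconsistency. `query_llm` and `perturb` are placeholders supplied by
# the caller, not part of any specific library.

from typing import Callable, List


def feasibility_verdict(query_llm: Callable[[str], str], task: str) -> bool:
    """Ask the model whether it considers the task solvable (YES/NO)."""
    reply = query_llm(
        f"Can you solve the following problem? Answer YES or NO only.\n{task}"
    )
    return reply.strip().upper().startswith("YES")


def inconsistency_rate(
    query_llm: Callable[[str], str],
    tasks: List[str],
    perturb: Callable[[str], str],
) -> float:
    """Fraction of tasks whose feasibility verdict flips under a
    logically coherent perturbation of similar complexity."""
    flips = 0
    for task in tasks:
        original = feasibility_verdict(query_llm, task)
        perturbed = feasibility_verdict(query_llm, perturb(task))
        if original != perturbed:
            flips += 1
    return flips / len(tasks) if tasks else 0.0
```

Under this reading, the abstract's headline figure would correspond to an `inconsistency_rate` above 0.45 on the perturbed STEM tasks.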
