Poster
in
Workshop: The Impact of Memorization on Trustworthy Foundation Models

Rethinking Memorization Measures in LLMs: Recollection vs. Counterfactual vs. Contextual Memorization

Bishwamittra Ghosh · Soumi Das · Qinyuan Wu · Mohammad Aflah Khan · Krishna Gummadi · Evimaria Terzi · Deepak Garg

[ Project Page ]
Sat 19 Jul 8:30 a.m. PDT — 9:30 a.m. PDT

Abstract:

Memorization in large language models (LLMs) is often viewed as undesirable for learning. Existing memorization measures largely focus on quantifying privacy risks rather than capturing the underlying phenomenon of memorization itself. To address this gap, we introduce contextual memorization, which disentangles memorization from contextual learning -- both of which LLMs perform during training. We further show that different measures -- recollection-based, counterfactual, and contextual -- can yield contradictory implications when applied to the same training dynamics, such as disagreeing on the order in which strings of varying frequency are memorized.
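The disagreement the abstract describes can be illustrated with a toy sketch. The definitions below are simplified stand-ins, not the paper's exact formulas: we assume each string has a loss under the trained model (`L_model`), under a counterfactual model trained without that string (`L_cf`), and under a context-only predictor (`L_ctx`), and we derive one score per measure from these hypothetical quantities.

```python
# Toy illustration (assumed, simplified definitions -- not the paper's formulas).
# Hypothetical per-string losses:
#   L_model: loss of the trained model on the string
#   L_cf:    loss of a counterfactual model trained without the string
#   L_ctx:   loss of a context-only (in-context learning) predictor

strings = {
    "frequent": {"L_model": 0.5, "L_cf": 1.0, "L_ctx": 0.6},
    "rare":     {"L_model": 1.0, "L_cf": 2.5, "L_ctx": 1.05},
}

def recollection(s):
    # Proxy for recollection-based memorization: lower model loss
    # (better verbatim recall) counts as more memorized.
    return -s["L_model"]

def counterfactual(s):
    # Loss increase when the string is held out of training.
    return s["L_cf"] - s["L_model"]

def contextual(s):
    # Memorization beyond what contextual learning alone explains.
    return max(0.0, s["L_ctx"] - s["L_model"])

for name, score in [("recollection", recollection),
                    ("counterfactual", counterfactual),
                    ("contextual", contextual)]:
    order = sorted(strings, key=lambda k: score(strings[k]), reverse=True)
    print(f"{name:14s} ranks strings as: {order}")
```

With these numbers, the recollection and contextual scores rank the frequent string as more memorized, while the counterfactual score ranks the rare string higher, showing how the three measures can order the same strings differently.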
