

Spotlight Poster

ActionPiece: Contextually Tokenizing Action Sequences for Generative Recommendation

Yupeng Hou · Jianmo Ni · Zhankui He · Noveen Sachdeva · Wang-Cheng Kang · Ed Chi · Julian McAuley · Derek Cheng

East Exhibition Hall A-B #E-2209
Thu 17 Jul 4:30 p.m. PDT — 7 p.m. PDT

Abstract:

Generative recommendation (GR) is an emerging paradigm in which user actions are tokenized into discrete tokens and predictions are produced by autoregressively generating the tokens of the next action. However, existing GR models tokenize each action independently, assigning the same fixed tokens to identical actions across all sequences without considering contextual relationships. This lack of context awareness can lead to suboptimal performance, as the same action may carry different meanings depending on its surrounding context. To address this issue, we propose ActionPiece, which explicitly incorporates context when tokenizing action sequences. In ActionPiece, each action is represented as a set of item features. Given a corpus of action sequences, we construct the vocabulary by merging frequently co-occurring feature patterns into new tokens, based on their co-occurrence frequency both within individual sets and across adjacent sets. Because feature sets are unordered, we further introduce set permutation regularization, which produces multiple segmentations of the same action sequence with identical semantics. Our code is available at: https://github.com/google-deepmind/action_piece.
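The vocabulary-construction step the abstract describes can be sketched as a BPE-style loop over sequences of feature sets: count how often pairs of tokens co-occur, both inside a single set and across adjacent sets, then greedily merge the most frequent pair into a new token. The sketch below is a minimal illustration under simplifying assumptions, not the released implementation: function names and the toy feature tokens are made up, and merges are applied only within a set, even though the counting step also credits cross-set co-occurrence as in the paper.

```python
from collections import Counter
from itertools import combinations

def count_pairs(corpus):
    """Count co-occurring token pairs within each feature set and
    across adjacent sets in every action sequence."""
    counts = Counter()
    for seq in corpus:
        for s in seq:                        # pairs inside one set
            for a, b in combinations(sorted(s), 2):
                counts[(a, b)] += 1
        for s, t in zip(seq, seq[1:]):       # pairs across adjacent sets
            for a in s:
                for b in t:
                    counts[tuple(sorted((a, b)))] += 1
    return counts

def merge_pair(corpus, pair, new_token):
    """Replace every within-set occurrence of `pair` with `new_token`
    (a simplification: the full method can also merge across sets)."""
    a, b = pair
    merged = []
    for seq in corpus:
        new_seq = []
        for s in seq:
            if a in s and b in s:
                s = (s - {a, b}) | {new_token}
            new_seq.append(s)
        merged.append(new_seq)
    return merged

def build_vocab(corpus, num_merges):
    """Greedily add `num_merges` new tokens, each formed from the
    currently most frequent co-occurring pair."""
    vocab = {tok for seq in corpus for s in seq for tok in s}
    for _ in range(num_merges):
        counts = count_pairs(corpus)
        if not counts:
            break
        pair, _ = counts.most_common(1)[0]
        new_token = f"<{pair[0]}+{pair[1]}>"
        vocab.add(new_token)
        corpus = merge_pair(corpus, pair, new_token)
    return vocab, corpus

# Toy corpus: each action is a set of item-feature tokens (illustrative values).
corpus = [
    [{"color:red", "cat:shoe"}, {"color:red", "cat:shirt"}],
    [{"color:red", "cat:shoe"}, {"color:blue", "cat:shoe"}],
]
vocab, merged = build_vocab(corpus, num_merges=1)
```

Because "cat:shoe" and "color:red" co-occur most often (both within sets and across adjacent ones), the first merge fuses them into a single context-informed token wherever they appear together in a set. Set permutation regularization, not shown here, would additionally randomize the order in which each set's features are linearized during segmentation, yielding multiple equivalent tokenizations for training.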

Lay Summary:

Many online services, like streaming platforms or shopping websites, recommend what you might want to do next based on your past actions, such as what you watched or bought. Recent AI systems try to generate these recommendations by learning from patterns in user activity, where each action (like watching a movie) is turned into a string of symbols the machine can understand. However, current methods always use the same symbols for the same action, no matter what happened before or after. This ignores context: watching a family movie after a string of horror films, for example, might mean something different than watching it after cartoons. Our work, ActionPiece, is the first to make this process context-aware. It converts user actions into symbol sequences that adapt depending on surrounding actions, capturing richer meaning. This helps AI models better understand what people are doing and why, leading to smarter personalized recommendations.
