

Poster in Workshop: 2nd Workshop on Test-Time Adaptation: Putting Updates to the Test (PUT)

LIFT: Improving Long Context Understanding of Large Language Models through Long Input Fine-Tuning

Yansheng Mao · Yufei Xu · Jiaqi Li · Fanxu Meng · Haotong Yang · Zilong Zheng · Xiyuan Wang · Muhan Zhang

Fri 18 Jul 11:15 a.m. PDT — noon PDT

Abstract:

Long-context understanding remains challenging for large language models due to their limited context windows. This extended abstract presents Long Input Fine-Tuning (LIFT), a novel framework for long-context modeling that improves the long-context performance of arbitrary short-context LLMs by dynamically adapting model parameters to the long input. Importantly, rather than endlessly extending the context window to accommodate increasingly long inputs in context, LIFT stores and absorbs the long input in the model's parameters. By absorbing the long input into its parameters, a short-context LLM can answer questions even when the required information is not present in the context at inference time. Furthermore, we introduce Gated Memory, a specialized attention adapter that automatically balances memorization of the long input against the model's original in-context learning (ICL) capabilities. We provide a comprehensive analysis of the strengths and limitations of LIFT on long-context understanding, offering valuable directions for future research.
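
As a rough illustration of the idea of absorbing a long input into model parameters, the sketch below fine-tunes a short-context causal LM on overlapping chunks of the input with a standard next-token objective and then answers a question without the input in context. The model name, chunking scheme, and hyperparameters are placeholder assumptions, and the Gated Memory adapter described above is omitted; this is a minimal sketch of the general approach, not the authors' implementation.

```python
# Minimal sketch of test-time fine-tuning on a long input (LIFT-style idea).
# All names, hyperparameters, and the file path are placeholder assumptions;
# the Gated Memory adapter from the abstract is not modeled here.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder short-context model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.train()

long_document = open("long_input.txt").read()  # hypothetical long input

# Split the long input into overlapping chunks that fit the short context window.
ids = tokenizer(long_document, return_tensors="pt").input_ids[0]
chunk_len, stride = 512, 256
chunks = [ids[i : i + chunk_len] for i in range(0, len(ids) - 1, stride)]

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
for epoch in range(2):  # a few passes over the input
    for chunk in chunks:
        batch = chunk.unsqueeze(0)
        # Standard next-token prediction loss "absorbs" the chunk into the weights.
        loss = model(input_ids=batch, labels=batch).loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()

# At inference time, the question is asked without the long input in context.
model.eval()
question = "Question about the document: ...\nAnswer:"
inputs = tokenizer(question, return_tensors="pt")
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```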
