Poster
in
Workshop: 2nd Workshop on Test-Time Adaptation: Putting Updates to the Test (PUT)

Keep the Alignment, Skip the Overhead: Lightweight Instruction Alignment for Continually Trained LLMs

Ishan Jindal · Badrinath Chandana · Pranjal Bharti · Lakkidi Vinay · Sachin Sharma

Fri 18 Jul 11:15 a.m. PDT — noon PDT

Abstract:

Instruction fine-tuning aligns language models with human intent but is computationally costly. Continual pretraining on domain-specific data, while effective for adaptation, can degrade instruction-following capabilities. We introduce instruction residuals—the parameter delta between an instruction-tuned model and its base model—as a lightweight mechanism to recover instruction alignment post-adaptation. Instruction residuals can be transferred across checkpoints within the same model family, enabling restoration of instruction-following behavior without full retraining. We evaluate our method on LLaMA and Qwen models under domain shifts of up to 1B tokens, showing that instruction residuals effectively preserve alignment while allowing continual domain learning. Our results establish a practical framework for modular, compute-efficient instruction retention in evolving language models.
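The core operation described in the abstract—extracting an instruction residual and re-applying it to a continually pretrained checkpoint—can be sketched as simple elementwise arithmetic over matching parameter dictionaries. This is a minimal illustration, not the authors' implementation; the toy scalar "weights" and function names are invented for clarity, and real models would use tensor state dicts of the same architecture.

```python
# Hedged sketch of "instruction residuals": the parameter delta between an
# instruction-tuned model and its base model, later added to a continually
# pretrained checkpoint of the same architecture. Toy dicts of floats stand
# in for real model state dicts; all names here are illustrative.

def instruction_residual(instruct_sd, base_sd):
    """Residual = instruct - base, computed per parameter."""
    return {k: instruct_sd[k] - base_sd[k] for k in base_sd}

def apply_residual(cpt_sd, residual):
    """Recover alignment: continually-pretrained weights + residual."""
    return {k: cpt_sd[k] + residual[k] for k in cpt_sd}

# Toy example with scalar "weights".
base = {"w": 1.0, "b": 0.5}         # base model
instruct = {"w": 1.4, "b": 0.3}     # after instruction tuning
cpt = {"w": 1.1, "b": 0.7}          # base after continued domain pretraining

delta = instruction_residual(instruct, base)
realigned = apply_residual(cpt, delta)
```

The transferability claim in the abstract corresponds to computing `delta` once from one (base, instruct) pair and applying it to a later checkpoint `cpt` in the same family, avoiding a full instruction-tuning run.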
