Poster
LIFT the Veil for the Truth: Principal Weights Emerge after Rank Reduction for Reasoning-Focused Supervised Fine-Tuning
Zihang Liu · Tianyu Pang · Oleg Balabanov · Chaoqun Yang · Tianjin Huang · Lu Yin · Yaoqing Yang · Shiwei Liu
East Exhibition Hall A-B #E-2604
Modern language-based AI models learn to “reason” by adapting their weights to complex tasks through fine-tuning. As AI models continue to grow to the billion-parameter scale, it becomes crucial to develop fine-tuning methods that combine superior performance with better efficiency. To address this problem, in this paper we study sparse fine-tuning, which updates only a tiny subset of model weights. A central challenge for sparse fine-tuning is identifying that critical subset. We find that the components critical to fine-tuning can be characterized by the top eigenspace of the weight matrix: the weights with the largest magnitude after low-rank approximation are the Principal Weights critical to fine-tuning. We then design a method that fine-tunes only the Principal Weights, which we name Low-rank Informed Sparse Fine-Tuning (LIFT).

In empirical studies, we find that LIFT achieves stronger results on reasoning tasks than dense fine-tuning methods, while better preserving the knowledge the model already has. Furthermore, the memory overhead of LIFT is significantly lower than that of dense fine-tuning and comparable to the best efficient fine-tuning methods.

By “lifting the veil” with low-rank approximation and fine-tuning the largest-magnitude weights, LIFT finds the “truth” within model weights: the components critical to fine-tuning. This work provides insight into determining the critical components of model weights and inspires future research on designing more efficient fine-tuning approaches that improve the reasoning ability of large AI models.
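
For concreteness, the selection step might look like the minimal PyTorch sketch below. This is an illustration of the idea as described in the abstract, not the paper's released implementation; the function name principal_weight_mask and the hyperparameters rank and density are assumptions introduced here.

```python
import torch

def principal_weight_mask(W: torch.Tensor, rank: int, density: float) -> torch.Tensor:
    """Return a boolean mask over W selecting the Principal Weights.

    Sketch of the idea in the abstract: low-rank approximate W with a
    truncated SVD, then keep the entries whose magnitudes are largest
    *after* rank reduction. `rank` and `density` are illustrative
    hyperparameters, not the paper's notation.
    """
    # Truncated SVD keeps the top singular directions (the "top eigenspace").
    U, S, Vh = torch.linalg.svd(W, full_matrices=False)
    W_low = U[:, :rank] @ torch.diag(S[:rank]) @ Vh[:rank, :]

    # Principal Weights: the largest-magnitude entries of the rank-reduced matrix.
    k = max(1, int(density * W.numel()))
    threshold = torch.topk(W_low.abs().flatten(), k).values.min()
    return W_low.abs() >= threshold

# During sparse fine-tuning, only the masked entries would be updated, e.g.
# by zeroing gradients outside the mask before each optimizer step:
#     W.grad.mul_(mask)
```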