Poster in Workshop: ES-FoMo III: 3rd Workshop on Efficient Systems for Foundation Models
An Efficient Row-Based Sparse Fine-Tuning with Low Quantization Error
Cen-Jhih Li · Aditya Bhaskara
Abstract:
Fine-tuning is essential for adapting large language models to downstream tasks, but it can be costly for users with limited resources. To address this, Sparse Fine-Tuning (SpFT) and Low-Rank Adaptation (LoRA) have been widely adopted for efficient fine-tuning. In this work, we propose a new SpFT framework inspired by neural network pruning: we identify important neurons using structural pruning and fine-tune only the associated weights. Experiments on common language tasks show our method improves SpFT's memory efficiency by 20–50% while matching the accuracy of state-of-the-art methods such as LoRA variants.
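The abstract does not spell out the method's details, but the core idea of selecting important neurons and fine-tuning only their weights can be illustrated with a minimal PyTorch sketch. The row-scoring rule (an L2-norm proxy for structural-pruning importance) and all names here (`RowSparseLinear`, `density`) are illustrative assumptions, not the paper's actual criterion or implementation.

```python
# Minimal sketch: fine-tune only the rows (output neurons) of a frozen linear
# layer that a pruning-style importance score marks as important.
# Assumptions: L2-norm row scoring, names RowSparseLinear/density are hypothetical.
import torch
import torch.nn as nn


class RowSparseLinear(nn.Module):
    """Wraps a frozen linear layer; trains only an update for selected rows."""

    def __init__(self, base: nn.Linear, density: float = 0.1):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # freeze the pretrained weights

        # Score each output neuron (row of the weight matrix); an L2-norm
        # proxy stands in for a structural-pruning importance score.
        scores = self.base.weight.detach().norm(dim=1)
        k = max(1, int(density * scores.numel()))
        self.register_buffer("rows", torch.topk(scores, k).indices)

        # Trainable update only for the k selected rows: k x in_features.
        self.delta = nn.Parameter(torch.zeros(k, self.base.in_features))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.base(x)                         # frozen pretrained output
        update = torch.zeros_like(out)
        update[..., self.rows] = x @ self.delta.T  # sparse row-wise update
        return out + update


if __name__ == "__main__":
    layer = RowSparseLinear(nn.Linear(64, 32), density=0.25)
    y = layer(torch.randn(4, 64))
    trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
    print(y.shape, trainable)  # torch.Size([4, 32]) 512
```

In this sketch the memory saving comes from storing gradients and optimizer state only for the selected rows rather than the full weight matrix; how the paper selects rows and controls quantization error is described in the full text, not here.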