Poster
Sketch to Adapt: Fine-Tunable Sketches for Efficient LLM Adaptation
Tianyi Zhang · Junda Su · Aditya Desai · Oscar Wu · Zhaozhuo Xu · Anshumali Shrivastava
East Exhibition Hall A-B #E-2010
Large language models (LLMs), such as those used in chatbots and search engines, require substantial computational resources because of their enormous size. This creates challenges when adapting these models to new tasks, as updating or fine-tuning them is often slow and memory-intensive.

Our work introduces SketchTune, a new technique that first compresses the LLM into a much smaller, "sketched" version. Unlike most compression methods, SketchTune makes this compressed model fully trainable, so it can still be adapted to new tasks. Instead of updating all of the original model's parameters, SketchTune fine-tunes only a small set of parameters within the compressed model.

We show that models compressed and adapted with SketchTune can achieve similar or even better performance than traditional methods while using far less memory and compute. This makes it easier and more efficient for a wider range of people and organizations to use and customize powerful LLMs.
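To make the idea concrete, the toy Python sketch below illustrates the general concept of a compressed-yet-trainable weight representation: every entry of a dense weight matrix is mapped to one of K shared values (a small codebook), the integer assignments stay fixed, and "fine-tuning" updates only the K codebook values. This is a minimal illustration under our own simplifying assumptions (uniform-quantile bucketing, a toy least-squares objective), not SketchTune's actual sketching or training algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

# Original dense weight matrix of a toy layer.
W = rng.normal(size=(256, 256)).astype(np.float32)

# "Sketch" the weights: map every entry to one of K shared values.
# The integer assignments are fixed; only the K codebook values are trainable.
K = 16
edges = np.quantile(W, np.linspace(0, 1, K + 1))
idx = np.clip(np.searchsorted(edges, W, side="right") - 1, 0, K - 1)
codebook = np.array(
    [W[idx == k].mean() if np.any(idx == k) else 0.0 for k in range(K)],
    dtype=np.float32,
)

def sketched_weight():
    # Reconstruct the full matrix from K shared values + fixed integer indices.
    return codebook[idx]

# Toy "fine-tuning": nudge only the K codebook entries toward a target output,
# leaving the much larger index tensor untouched.
x = rng.normal(size=(256,)).astype(np.float32)
target = rng.normal(size=(256,)).astype(np.float32)
lr = 1e-4
for _ in range(50):
    y = sketched_weight() @ x
    err = y - target                      # dL/dy for 0.5 * ||y - target||^2
    grad_W = np.outer(err, x)             # dL/dW
    # Gradient of each codebook entry = sum of grads over entries mapped to it.
    grad_code = np.array([grad_W[idx == k].sum() for k in range(K)],
                         dtype=np.float32)
    codebook -= lr * grad_code

print("trainable parameters:", K, "vs. original:", W.size)
```

In this toy setup only 16 numbers are updated instead of 65,536, which mirrors the abstract's point that adapting a sketched model touches a small parameter set rather than the full weight matrix.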