Poster in Workshop: CODEML: Championing Open-source DEvelopment in Machine Learning
Liger-Kernel: Efficient Triton Kernels for LLM Training
Pin-Lun Hsu · Yun Dai · Vignesh Kothapalli · Qingquan Song · Shao Tang · Siyu Zhu · Steven Shimizu · Shivam Sahni · Haowen Ning · Yanning Chen · Zhipeng Wang
Abstract:
Training large language models (LLMs) efficiently at scale remains challenging due to rising compute and memory demands. We present Liger-Kernel, an open-source suite of Triton kernels for core LLM primitives and for a broad set of loss functions spanning pre-training, supervised fine-tuning (SFT), distillation, alignment, and RLHF. Each kernel applies aggressive operator fusion, in-place gradient computation, and, where advantageous, input chunking to reduce memory traffic and kernel-launch overhead. On widely used LLMs, these optimizations increase training throughput by roughly 20% and reduce GPU memory consumption by roughly 60% relative to Hugging Face baselines. The code is available under a permissive license at https://anonymous.4open.science/r/Liger-Kernel-E8D9.
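To make the operator-fusion idea concrete, below is a minimal sketch of an RMSNorm forward pass written as a single Triton kernel, in the spirit of the fused primitives the abstract describes. It is not Liger-Kernel's actual implementation; the kernel and wrapper names are hypothetical, and the pointer arithmetic assumes a contiguous 2-D CUDA input. The point is that the square, mean, rsqrt, and scale steps share one pass over memory instead of several separate elementwise kernel launches.

```python
# Hypothetical sketch of a fused RMSNorm forward kernel (not Liger-Kernel's code).
import torch
import triton
import triton.language as tl


@triton.jit
def rmsnorm_fwd_kernel(
    x_ptr, w_ptr, y_ptr,
    n_cols, eps,
    BLOCK_SIZE: tl.constexpr,
):
    # One program instance normalizes one row of the input.
    row = tl.program_id(0)
    cols = tl.arange(0, BLOCK_SIZE)
    mask = cols < n_cols

    # Single load of the row; all fused arithmetic stays in registers.
    x = tl.load(x_ptr + row * n_cols + cols, mask=mask, other=0.0).to(tl.float32)
    w = tl.load(w_ptr + cols, mask=mask, other=0.0).to(tl.float32)

    # Fused: square, mean, sqrt, normalize, and scale in one kernel.
    rms = tl.sqrt(tl.sum(x * x, axis=0) / n_cols + eps)
    y = (x / rms) * w

    tl.store(y_ptr + row * n_cols + cols, y, mask=mask)


def rmsnorm(x: torch.Tensor, weight: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """Row-wise RMSNorm over the last dimension of a contiguous 2-D CUDA tensor."""
    n_rows, n_cols = x.shape
    y = torch.empty_like(x)
    block_size = triton.next_power_of_2(n_cols)
    rmsnorm_fwd_kernel[(n_rows,)](x, weight, y, n_cols, eps, BLOCK_SIZE=block_size)
    return y
```

In eager PyTorch this computation would typically issue several kernels and materialize intermediate tensors; fusing it into one Triton kernel, as sketched above, is the kind of optimization that lowers memory traffic and launch overhead.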