

Invited Talk in Workshop: Tiny Titans: The next wave of On-Device Learning for Foundation Models (TTODLer-FM)

Invited Keynote #2

Dan Alistarh

Fri 18 Jul 10:15 a.m. PDT — 11 a.m. PDT

Abstract:

The last few years have seen an explosion of interest in AI efficiency. One of the holy grails of the area has been training and running inference on models end-to-end in low precision, for instance by leveraging the quantized matrix multiplication support on modern GPUs. In this talk, I will present some of our lab's recent work on this topic, investigating low-precision training of LLMs. Specifically, I will cover a new state-of-the-art algorithm for quantized training called QuEST, discuss the limits of current approaches as characterized via scaling laws, and describe fast kernel support for low-precision training.
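To make the notion of quantized training concrete, the following is a minimal generic sketch of "fake-quantized" training with a straight-through estimator in PyTorch. It illustrates the broad idea of training through a low-precision matrix multiply, not the QuEST algorithm itself; the class names (`STEQuantize`, `QuantLinear`) and the 4-bit setting are illustrative assumptions, not from the talk.

```python
import torch
import torch.nn as nn

# Generic illustration of quantization-aware training; NOT the QuEST algorithm.

class STEQuantize(torch.autograd.Function):
    """Symmetric per-tensor quantization with a straight-through estimator:
    round to a low-precision grid in the forward pass, pass gradients
    through unchanged in the backward pass."""

    @staticmethod
    def forward(ctx, x, num_bits):
        qmax = 2 ** (num_bits - 1) - 1              # e.g. 7 for signed 4-bit
        scale = x.abs().max().clamp(min=1e-8) / qmax
        return torch.round(x / scale).clamp(-qmax, qmax) * scale

    @staticmethod
    def backward(ctx, grad_output):
        return grad_output, None                    # straight-through: identity gradient


class QuantLinear(nn.Module):
    """Linear layer whose weights and activations are quantized before the matmul,
    so the product could in principle run on low-precision GPU hardware."""

    def __init__(self, in_features, out_features, num_bits=4):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.02)
        self.num_bits = num_bits

    def forward(self, x):
        w_q = STEQuantize.apply(self.weight, self.num_bits)
        x_q = STEQuantize.apply(x, self.num_bits)
        return x_q @ w_q.t()                        # the low-precision matmul


# Tiny usage example: one training step through the quantized layer.
layer = QuantLinear(16, 8, num_bits=4)
opt = torch.optim.SGD(layer.parameters(), lr=1e-2)
loss = layer(torch.randn(4, 16)).pow(2).mean()
loss.backward()                                     # gradients flow via the STE
opt.step()
```

In practice, end-to-end low-precision training also quantizes gradients and relies on dedicated GPU kernels for the integer matmuls; the sketch above only shows the forward-path quantization and the gradient-estimation trick.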
