Workshop
Tiny Titans: The next wave of On-Device Learning for Foundation Models (TTODLer-FM)
Stefanos Laskaridis · Samuel Horváth · Berivan Isik · Peter Kairouz · Bilge Acun · Christina Giannoula · Angelos Katharopoulos · Martin Takac · Nicholas Lane
West Meeting Room 215-216
Fri 18 Jul, 8:30 a.m. PDT
The rapid evolution of deep learning, propelled by transformer-based architectures and significant hardware advancements, has unlocked unprecedented capabilities across diverse domains, from biological sciences to autonomous systems. As foundation models continue to scale, they introduce new challenges in resource management, particularly in data centers, and in data availability, prompting us to explore more broadly how distributed and on-device resources can be leveraged for training and inference. Small Language Models (SLMs) are emerging as a compelling alternative for generative AI, particularly at the edge, offering a sustainable balance between efficiency and user privacy. This workshop aims to bring together algorithms and systems experts to discuss the opportunities and challenges of on-device machine learning. We hope to explore to what extent SLMs can compete with or complement LLMs and to identify methods to enhance their quality and efficiency. Addressing this shift requires innovation in algorithm and system co-design, underscoring the importance of interdisciplinary approaches for future applications.
Schedule
Fri 8:30 a.m. - 8:45 a.m.
|
Introduction from Organizers ( Intro ) > | Stefanos Laskaridis 🔗 |
Fri 8:45 a.m. - 9:30 a.m.
|
Invited Keynote #1 ( Invited Talk ) > link | Zechun Liu 🔗 |
Fri 9:30 a.m. - 9:45 a.m.
|
Too Big to Think: Capacity, Memorization, and Generalization in Pre-Trained Transformers ( Oral ) > link | Joshua Barron · Devin White 🔗 |
Fri 9:45 a.m. - 10:00 a.m.
|
Kinetics: Rethinking Test-Time Scaling Laws ( Oral ) > link | Ranajoy Sadhukhan · Zhuoming Chen · Haizhong Zheng · Yang Zhou · Emma Strubell · Beidi Chen 🔗 |
Fri 10:00 a.m. - 10:15 a.m.
|
Coffee Break | 🔗 |
Fri 10:15 a.m. - 11:00 a.m.
|
Invited Keynote #2 ( Invited Talk ) > link | Dan Alistarh 🔗 |
Fri 11:00 a.m. - 11:15 a.m.
|
Preserve then Quantize: Dominant-Subspace Guided Low-Rank Reconstruction ( Oral ) > link | Yoonjun Cho · Dongjae Jeon · Soeun Kim · Albert No 🔗 |
Fri 11:15 a.m. - 11:30 a.m.
|
WhisperKit: On-device Real-time ASR with Billion-Scale Transformers ( Oral ) > link | Berkin Durmus · Arda Okan · Eduardo Pacheco · Zach Nagengast · Atila Orhon 🔗 |
Fri 11:30 a.m. - 12:15 p.m.
|
Invited Keynote #3 ( Invited Talk ) > | Song Han 🔗 |
Fri 1:00 p.m. - 1:45 p.m.
|
Randomized Asymmetric Chain of LoRA: The First Meaningful Theoretical Framework for Low-Rank Adaptation for Federated Learning ( Poster ) > link | Grigory Malinovsky · Umberto Michieli · Hasan Hammoud · Taha Ceritli · Hayder Elesedy · Mete Ozay · Peter Richtarik 🔗 |
Fri 1:00 p.m. - 1:45 p.m.
|
Unlocking the Potential of Extremely Low-Bit Sparse Transformers through Adaptive Multi-bit Supermasks and Random Weights ( Poster ) > link | Yasuyuki Okoshi · Hikari Otsuka · Junnosuke Suzuki · Daichi Fujiki · Masato Motomura 🔗 |
Fri 1:00 p.m. - 1:45 p.m.
|
Addition is almost all you need: Compressing neural networks with double binary factorization ( Poster ) > link | Vladimír Boža · Vladimír Macko 🔗 |
Fri 1:00 p.m. - 1:45 p.m.
|
Capability Transfer from Large to Small Models with Synthetically-Generated Data ( Poster ) > link | Lillian Sun · Emma Yang · Arif Dayi 🔗 |
Fri 1:00 p.m. - 1:45 p.m.
|
TensorSLM: Energy-efficient Embedding Compression of Sub-billion Parameter Language Models on Low-end Devices ( Poster ) > link | Mingxue Xu · Yao Lei Xu · Danilo Mandic 🔗 |
Fri 1:00 p.m. - 1:45 p.m.
|
Leveraging Coordinate Momentum in SignSGD and Muon: Memory-Optimized Zero-Order LLM Fine-Tuning ( Poster ) > link | Egor Petrov · Evseev Grigoriy · Aleksey Antonov · Andrey Veprikov · Pavel Plyusnin · Nikolay Bushkov · Stanislav Moiseev · Aleksandr Beznosikov 🔗 |
Fri 1:00 p.m. - 1:45 p.m.
|
Higher Acceptance Rates for Speculative Decoding with Randomised Drafting ( Poster ) > link | William Toner · Martin Asenov · Rajkarn Singh · Artjom Joosen 🔗 |
Fri 1:00 p.m. - 1:45 p.m.
|
Zoop it! Efficient Zero-Order Optimization with Output Perturbation ( Poster ) > link | Xixi Hu · Bo Liu · Qiang Liu · Xiaocong Du · Bhargav Bhushanam · Louis Feng · Chengyue Gong · Kaizhao Liang 🔗 |
Fri 1:00 p.m. - 1:45 p.m.
|
MatMuls are Enough for Efficient and Performant Linear-Time Attention ( Poster ) > link | Andrew Argatkiny · Ilya Makarov 🔗 |
Fri 1:00 p.m. - 1:45 p.m.
|
Zeroth-Order Optimization is Secretly Single-Step Policy Optimization ( Poster ) > link | Junbin Qiu · Zhengpeng Xie · Xiangda Yan · Yongjie Yang · Yao Shu 🔗 |
Fri 1:00 p.m. - 1:45 p.m.
|
Spec-LLaVA: Accelerating Vision-Language Models with Dynamic Tree-Based Speculative Decoding ( Poster ) > link | Mingxiao Huo · Jiayi Zhang · Hewei Wang · Jinfeng Xu · Zheyu Chen · Huilin Tai · Ian Chen 🔗 |
Fri 1:00 p.m. - 1:45 p.m.
|
LoFT: Low-Rank Adaptation That Behaves Like Full Fine-Tuning ( Poster ) > link | Nurbek Tastan · Stefanos Laskaridis · Martin Takac · Karthik Nandakumar · Samuel Horváth 🔗 |
Fri 1:00 p.m. - 1:45 p.m.
|
FRUGAL: Memory-Efficient Optimization by Reducing State Overhead for Scalable Training ( Poster ) > link | Filipp Zmushko · Aleksandr Beznosikov · Martin Takac · Samuel Horváth 🔗 |
Fri 1:00 p.m. - 1:45 p.m.
|
Towards understanding of orthogonalization in Muon ( Poster ) > link | Valentyn Boreiko · Zhiqi Bu · Sheng Zha 🔗 |
Fri 1:00 p.m. - 1:45 p.m.
|
Poster Session #1 ( Poster Session ) > | 🔗 |
Fri 1:30 p.m. - 2:15 p.m.
|
Invited Keynote #4 ( Invited Talk ) > link | Fartash Faghri 🔗 |
Fri 2:15 p.m. - 2:30 p.m.
|
Offloaded Reasoning: Efficient Inference for Large Language Models via Modular Reasoning and Refinement ( Oral ) > link | Ishan Jindal · Jayant Taneja · Badrinath Chandana · Vikas Kapur · Sachin Sharma 🔗 |
Fri 2:30 p.m. - 2:45 p.m.
|
Enhancing Reasoning Capabilities of Small Language Models with Blueprints and Prompt Template Search ( Oral ) > link | Dongge Han · Menglin Xia · Daniel Madrigal · Samuel Kessler · Ankur Mallick · Xuchao Zhang · Mirian Hipolito Garcia · Jin Xu · Victor Ruehle · Saravan Rajmohan 🔗 |
Fri 2:45 p.m. - 3:00 p.m.
|
Coffee Break | 🔗 |
Fri 3:00 p.m. - 3:45 p.m.
|
FGFP: A Fractional Gaussian Filter and Pruning for Deep Neural Networks Compression ( Poster ) > link | Kuan-Ting Tu · Po-Hsien Yu · Yu-Syuan Tseng · Shao-Yi Chien 🔗 |
Fri 3:00 p.m. - 3:45 p.m.
|
Gatekeeper: Improving Model Cascades Through Confidence Tuning ( Poster ) > link | Stephan Rabanser · Nathalie Rauschmayr · Achin Kulshrestha · Petra Poklukar · Wittawat Jitkrittum · Sean Augenstein · Congchao Wang · Federico Tombari 🔗 |
Fri 3:00 p.m. - 3:45 p.m.
|
DiffusionBlocks: Continuous-Time Blockwise Training Through Score-Based Diffusion Models ( Poster ) > link | Makoto Shing · Takuya Akiba 🔗 |
Fri 3:00 p.m. - 3:45 p.m.
|
Compression of Large Language Models by Neuron Summary ( Poster ) > link | Yancheng Wang · Dongfang Sun · Yingzhen Yang 🔗 |
Fri 3:00 p.m. - 3:45 p.m.
|
Predictive Scheduling for Efficient Inference-Time Reasoning in Large Language Models ( Poster ) > link | Katrina Brown · Aneesh Muppidi · Rana Shahout 🔗 |
Fri 3:00 p.m. - 3:45 p.m.
|
FAST: Federated Active Learning with Foundation Models for Communication-efficient Sampling and Training ( Poster ) > link | Haoyuan Li · Mathias Funk · Jindong Wang · Aaqib Saeed 🔗 |
Fri 3:00 p.m. - 3:45 p.m.
|
Overcoming label shift in targeted federated learning ( Poster ) > link | Adam Breitholtz · Edvin Listo Zec · Fredrik Johansson 🔗 |
Fri 3:00 p.m. - 3:45 p.m.
|
Token-Efficient RL for LLM Reasoning ( Poster ) > link | Alan Lee · Harry Tong 🔗 |
Fri 3:00 p.m. - 3:45 p.m.
|
Dynamic Guardian Models: Realtime Content Moderation With User-Defined Policies ( Poster ) > link | Monte Hoover · Vatsal Baherwani · Neel Jain · Khalid Saifullah · Joseph Vincent · Chirag Jain · Melissa Rad · C. Bayan Bruss · Ashwinee Panda · Tom Goldstein 🔗 |
Fri 3:00 p.m. - 3:45 p.m.
|
First Provable Guarantees for Practical Private FL: Beyond Restrictive Assumptions ( Poster ) > link | Egor Shulgin · Grigory Malinovsky · Sarit Khirirat · Peter Richtarik 🔗 |
Fri 3:00 p.m. - 3:45 p.m.
|
Lion Cub: Minimizing Communication Overhead in Distributed Lion ( Poster ) > link | Satoki Ishikawa · Tal Ben-Nun · Brian Van Essen · Rio Yokota · Nikoli Dryden 🔗 |
Fri 3:00 p.m. - 3:45 p.m.
|
SPAM: Stochastic Proximal Point Method with Momentum Variance Reduction for Non-convex Cross-Device Federated Learning ( Poster ) > link | Avetik Karagulyan · Egor Shulgin · Abdurakhmon Sadiev · Peter Richtarik 🔗 |
Fri 3:00 p.m. - 3:45 p.m.
|
Poster Session #2 ( Poster Session ) > | 🔗 |
Fri 3:30 p.m. - 4:15 p.m.
|
Invited Keynote #5 ( Invited Talk ) > link | Kangwook Lee 🔗 |
Fri 4:15 p.m. - 4:30 p.m.
|
Best Paper/Poster Awards ( Outro ) > | Samuel Horváth 🔗 |
Fri 4:30 p.m. - 5:15 p.m.
|
Panel Session ( Panel Session ) > | Kangwook Lee · Zechun Liu · Fartash Faghri 🔗 |