Poster
Beyond Communication Overhead: A Multilevel Monte Carlo Approach for Mitigating Compression Bias in Distributed Learning
Ze'ev Zukerman · Bassel Hamoud · Kfir Levy
West Exhibition Hall B2-B3 #W-515
In large-scale machine learning, especially when training very large models like ChatGPT, many computers work together by exchanging information, and this communication can become a major bottleneck. To save bandwidth, systems compress the data they send. However, this introduces a trade-off: the most aggressive compression schemes are biased and weaken the theoretical guarantees of training, while the safest, unbiased ones compress far less and slow training down. Our work introduces a technique that uses a concept from statistics called “Multilevel Monte Carlo” to get the best of both worlds: fast, efficient communication with reliable learning guarantees. We show how this approach turns even biased, aggressive compressors into accurate, trustworthy updates. As a result, machine learning systems can train faster across many devices without sacrificing robustness or accuracy.
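To make the idea concrete, below is a minimal sketch of the generic Multilevel Monte Carlo randomized-telescoping construction, using Top-K sparsification as an illustrative family of biased compressors. The compressor family, the level probabilities, and the names (top_k, mlmc_compress, k0) are assumptions chosen for illustration, not the exact construction presented in the poster.

```python
import numpy as np

def top_k(v, k):
    """Biased Top-K compressor: keep only the k largest-magnitude coordinates."""
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out

def mlmc_compress(v, k0=4, rng=None):
    """Randomized MLMC telescoping estimator built from biased Top-K levels.

    Level l keeps k0 * 2**l coordinates; the deepest level keeps every
    coordinate, so the telescoping sum C_0 + sum_l (C_l - C_{l-1}) equals v
    and the randomized estimator satisfies E[estimate] = v.
    Most calls only pay for the cheap base level plus a small correction.
    """
    rng = rng or np.random.default_rng()
    d = v.size
    # Use enough levels so the top level keeps the full vector.
    num_levels = max(1, int(np.ceil(np.log2(d / k0))))
    ks = [min(k0 * 2**l, d) for l in range(num_levels + 1)]

    # Sample one correction level l >= 1 with geometrically decaying
    # probability, so deep (expensive) levels are transmitted only rarely.
    probs = np.array([2.0**-l for l in range(1, num_levels + 1)])
    probs /= probs.sum()
    level = rng.choice(np.arange(1, num_levels + 1), p=probs)

    base = top_k(v, ks[0])  # always sent: cheap but biased on its own
    correction = (top_k(v, ks[level]) - top_k(v, ks[level - 1])) / probs[level - 1]
    return base + correction
```

Averaging many independent calls to mlmc_compress(v) converges to v, whereas averaging repeated Top-K outputs alone would remain biased toward the largest coordinates; this is the sense in which the multilevel randomization removes compression bias while keeping the expected communication cost low.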