

Poster

Low-Rank Thinning

Annabelle Carrell · Albert Gong · Abhishek Shetty · Raaz Dwivedi · Lester Mackey

West Exhibition Hall B2-B3 #W-1012
[ Project Page ]
Tue 15 Jul 4:30 p.m. PDT — 7 p.m. PDT

Abstract:

The goal in thinning is to summarize a dataset using a small set of representative points. Remarkably, sub-Gaussian thinning algorithms like Kernel Halving and Compress can match the quality of uniform subsampling while substantially reducing the number of summary points. However, existing guarantees cover only a restricted range of distributions and kernel-based quality measures and suffer from pessimistic dimension dependence. To address these deficiencies, we introduce a new low-rank analysis of sub-Gaussian thinning that applies to any distribution and any kernel, guaranteeing high-quality compression whenever the kernel or data matrix is approximately low-rank. To demonstrate the broad applicability of the techniques, we design practical sub-Gaussian thinning approaches that improve upon the best known guarantees for approximating attention in transformers, accelerating stochastic gradient training through reordering, and distinguishing distributions in near-linear time.
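To make the thinning setup concrete, here is a minimal, self-contained sketch of a Kernel Halving-style split. It is an illustrative simplification, not the paper's algorithm: the Gaussian kernel, the fixed balancing threshold, and the pairwise assignment rule are all assumptions chosen for clarity (the paper uses adaptive thresholds and arbitrary kernels).

```python
import numpy as np

def gauss_kernel(a, b, bw=1.0):
    # Gaussian kernel; an illustrative choice, any kernel could be used
    d = a - b
    return np.exp(-np.dot(d, d) / (2 * bw**2))

def kernel_halving(X, rng):
    """Simplified Kernel Halving sketch: split 2n points into two halves
    of size n, using a self-balancing signed walk to keep the kernel
    discrepancy between the halves small.  Illustrative only: the fixed
    threshold below stands in for the paper's adaptive thresholds."""
    n2 = len(X)
    signs = np.zeros(n2)          # +1 if assigned to S1, -1 if to S2
    S1, S2 = [], []
    for i in range(0, n2, 2):
        x, y = X[i], X[i + 1]
        # signed running discrepancy evaluated at x minus at y
        alpha = sum(signs[j] * (gauss_kernel(X[j], x) - gauss_kernel(X[j], y))
                    for j in range(i))
        thresh = 1.0              # fixed balancing threshold (assumption)
        p = min(1.0, max(0.0, 0.5 * (1 - alpha / thresh)))
        if rng.random() < p:
            signs[i], signs[i + 1] = 1.0, -1.0
            S1.append(x); S2.append(y)
        else:
            signs[i], signs[i + 1] = -1.0, 1.0
            S1.append(y); S2.append(x)
    return np.array(S1), np.array(S2)
```

Either returned half serves as a size-n summary of the 2n input points; repeating the halving yields the exponential compression the abstract describes.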

Lay Summary:

The goal in thinning is to summarize a dataset using a small set of representative points. Remarkably, recently developed thinning algorithms can match the quality of sampling without replacement while substantially reducing the number of summary points. However, existing guarantees are overly restrictive and pessimistic. To address these deficiencies, we introduce a new analysis of thinning that applies to any distribution and any kernel, guaranteeing high-quality compression whenever the kernel or data matrix is approximately low-rank. To demonstrate the broad applicability of the techniques, we design practical thinning approaches that improve upon the best known guarantees for approximating the quadratic-time computations in neural networks, speeding up model training through example reordering, and rapidly detecting salient differences between datasets.
