ICML Poster Learning multivariate Gaussians with imperfect advice

Arnab Bhattacharyya · Davin Choo · Philips George John · Themistoklis Gouleakis

West Exhibition Hall B2-B3 #W-1017

Tue 15 Jul 11 a.m. PDT — 1:30 p.m. PDT

Abstract: We revisit the problem of distribution learning within the framework of learning-augmented algorithms. In this setting, a probability distribution is provided as potentially inaccurate advice on the true, unknown distribution. Our objective is to develop learning algorithms whose sample complexity decreases as the quality of the advice improves, thereby surpassing standard learning lower bounds when the advice is sufficiently accurate. Specifically, we demonstrate that this outcome is achievable for the problem of learning a multivariate Gaussian distribution $N(\mu, \Sigma)$ in the PAC learning setting. Classically, in the advice-free setting, $\widetilde{\Theta}(d^2/\varepsilon^2)$ samples are sufficient and, in the worst case, necessary to learn $d$-dimensional Gaussians up to TV distance $\varepsilon$ with constant probability. When we are additionally given a parameter $\widetilde{\Sigma}$ as advice, we show that $\widetilde{\mathcal{O}}(d^{2-\beta}/\varepsilon^2)$ samples suffice whenever $\| \widetilde{\Sigma}^{-1/2} \Sigma \widetilde{\Sigma}^{-1/2} - I_d \|_1 \leq \varepsilon d^{1-\beta}$ (where $\|\cdot\|_1$ denotes the entrywise $\ell_1$ norm) for any $\beta > 0$, yielding a polynomial improvement over the advice-free setting.
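To make the advice-quality condition in the abstract concrete, the following is a minimal illustrative sketch, not the paper's algorithm. It computes the entrywise $\ell_1$ error $\| \widetilde{\Sigma}^{-1/2} \Sigma \widetilde{\Sigma}^{-1/2} - I_d \|_1$ for a known $\Sigma$ and solves the condition for the largest $\beta$ it supports. The function names (`advice_error`, `largest_beta`, `leading_term`) and the toy inputs are assumptions introduced here. In the actual learning problem $\Sigma$ is unknown, so this quantity cannot be computed directly, and the leading terms below ignore the polylogarithmic factors hidden by the tilde notation.

```python
import math
import numpy as np


def advice_error(sigma, sigma_advice):
    """Entrywise l1 norm of A^{-1/2} @ sigma @ A^{-1/2} - I_d, where A is the advice."""
    # Inverse square root of the symmetric positive-definite advice matrix,
    # obtained from its eigendecomposition.
    eigvals, eigvecs = np.linalg.eigh(sigma_advice)
    inv_sqrt = eigvecs @ np.diag(eigvals ** -0.5) @ eigvecs.T
    whitened = inv_sqrt @ sigma @ inv_sqrt
    return float(np.abs(whitened - np.eye(sigma.shape[0])).sum())


def largest_beta(err, d, eps):
    """Largest beta with err <= eps * d**(1 - beta); requires err > 0 and d > 1.

    The abstract's guarantee applies only when the returned value is positive.
    """
    return 1.0 - math.log(err / eps) / math.log(d)


def leading_term(d, eps, beta=0.0):
    """Leading term d**(2 - beta) / eps**2 of the sample bound (log factors omitted)."""
    return d ** (2.0 - beta) / eps ** 2


# Toy example (hypothetical numbers): the advice is the identity and the true
# covariance is off by 1% on every diagonal entry.
d, eps = 100, 0.1
sigma_true = 1.01 * np.eye(d)
sigma_hat = np.eye(d)

err = advice_error(sigma_true, sigma_hat)   # about 1.0
beta = largest_beta(err, d, eps)            # about 0.5
with_advice = leading_term(d, eps, beta)    # about 1e5
advice_free = leading_term(d, eps)          # about 1e6
```

In this toy case the error is about $1 = \varepsilon d^{1/2}$, so the condition holds with $\beta \approx 1/2$ and the leading term drops from $d^2/\varepsilon^2$ to $d^{3/2}/\varepsilon^2$, a factor of $\sqrt{d} = 10$.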

Lay Summary:

Estimating the mean and covariance of a multivariate Gaussian distribution is a well-known problem in machine learning. In the worst case, it requires a number of samples that grows quadratically with the number of variates/features. We study a new setting where, in addition to data samples, we are given imperfect advice in the form of predictions/guesses for the mean and covariance. These predictions may come from prior models or expert knowledge, but we have no guarantees about their accuracy. We design an algorithm that first tests whether the advice is reliable. If it is, we use it to reduce the number of samples needed, applying tools from convex optimization. If it isn’t, we default to standard estimators. Our method is always correct and provably uses fewer samples when the advice is good. We also show that the trade-off between advice quality and sample efficiency is close to the best possible.
