Poster
Dimension-Independent Rates for Structured Neural Density Estimation
Vandermeulen · Wai Ming Tai · Bryon Aragam
West Exhibition Hall B2-B3 #W-907
Modern machine-learning systems often work with “high-dimensional” data: think of a photo with millions of pixels, each carrying three numbers (red, green, and blue). Classical statistics predicts that learning from such data should require an impossibly large number of examples, yet deep learning succeeds with far fewer in practice. Our study offers an explanation: real-world data has inherent structure that neural networks can exploit to learn efficiently from less data. The idea is that only certain parts of the data are useful for predicting other parts, and this structure is naturally captured by commonly used neural network models. We show that exploiting this structure dramatically reduces the amount of data a neural network needs, bringing the requirement down to the level of much lower-dimensional problems. This offers a fresh explanation for how neural networks learn effectively from the data available in modern applications.
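A rough way to see why such structure helps, sketched below as an illustrative back-of-the-envelope count (not taken from the poster itself): an unstructured joint distribution over d variables has a number of parameters that grows exponentially in d, while a model in which each variable is predicted from only a few neighboring variables grows merely linearly in d.

```python
def full_table_size(d, k=2):
    """Parameters for an unstructured joint distribution over d k-ary variables."""
    return k ** d

def markov_factorized_size(d, m, k=2):
    """Parameters when each variable is predicted from at most m neighbors
    (an illustrative local-dependence assumption)."""
    return d * k ** (m + 1)

# 30 binary variables: the full table is astronomically large,
# while the locally structured model stays tiny.
print(full_table_size(30))            # 1073741824
print(markov_factorized_size(30, 3))  # 480
```

This parameter-counting argument is only a heuristic analogy for the sample-complexity results the poster describes, but it conveys the same intuition: local dependence collapses an exponential problem into a tractable one.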