Poster
Dimension-Independent Rates for Structured Neural Density Estimation
Vandermeulen · Wai Ming Tai · Bryon Aragam
West Exhibition Hall B2-B3 #W-907
Modern machine-learning systems often work with “high-dimensional” data: think of a photo with millions of pixels, each carrying three numbers (red, green, and blue). Classical statistics predicts that learning from such data should require an impossibly large number of examples, yet deep learning succeeds with far fewer in practice. Our study offers an explanation: real-world data has inherent structure that neural networks can exploit to learn efficiently from less data. The idea is that only certain parts of the data are useful for predicting other parts, and this structure is naturally captured by commonly used neural network models. We show that exploiting this structure dramatically reduces the amount of data a neural network needs, bringing the requirement down to the level of much lower-dimensional problems. This offers a fresh explanation for how neural networks learn effectively from the data available in modern applications.
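A rough way to see why such structure helps, sketched below as an illustrative back-of-the-envelope count (not taken from the poster itself): an unstructured joint distribution over d variables has a number of parameters that grows exponentially in d, while a model in which each variable is predicted from only a few neighboring variables grows merely linearly in d.

```python
def full_table_size(d, k=2):
    """Parameters for an unstructured joint distribution over d k-ary variables."""
    return k ** d

def markov_factorized_size(d, m, k=2):
    """Parameters when each variable is predicted from at most m neighbors
    (an illustrative local-dependence assumption)."""
    return d * k ** (m + 1)

# 30 binary variables: the full table is astronomically large,
# while the locally structured model stays tiny.
print(full_table_size(30))            # 1073741824
print(markov_factorized_size(30, 3))  # 480
```

This parameter-counting argument is only a heuristic analogy for the sample-complexity results the poster describes, but it conveys the same intuition: local dependence collapses an exponential problem into a tractable one.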