# [2005.10002] Statistical learning theory of structured data

The traditional approach of statistical physics to supervised learning routinely assumes unrealistic generative models for the data: usually, inputs are independent random variables, uncorrelated with their labels. Only recently have statistical physicists started to explore more complex forms of data, such as equally-labelled points lying on (possibly low-dimensional) object manifolds. Here we provide a bridge between this recently established research area and the framework of statistical learning theory, a branch of mathematics devoted to inference in machine learning. The overarching motivation is the inadequacy of the classic rigorous results in explaining the remarkable generalization properties of deep learning. We propose a way to integrate physical models of data into statistical learning theory, and address, with both combinatorial and statistical mechanics methods, the computation of the Vapnik-Chervonenkis entropy, which counts the number of different binary classifications a model class can realize on a given dataset.
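As a toy illustration of what the Vapnik-Chervonenkis entropy counts, the sketch below enumerates the distinct dichotomies that homogeneous linear classifiers (hyperplanes through the origin) can realize on a small planar point set, and compares the count with Cover's classical combinatorial formula. This example is not from the paper; the point coordinates and the brute-force angle sweep are illustrative choices.

```python
import math

def cover_count(n, d):
    """Cover's formula: number of dichotomies of n points in general
    position in R^d realizable by homogeneous linear classifiers."""
    return 2 * sum(math.comb(n - 1, k) for k in range(d))

# Four planar points in general position (unit vectors at fixed angles).
angles_deg = [0, 50, 110, 200]
pts = [(math.cos(math.radians(a)), math.sin(math.radians(a)))
       for a in angles_deg]

# In 2D the sign pattern of w.x depends only on the direction of w,
# so a fine sweep over weight directions finds every realizable dichotomy.
patterns = set()
for t in range(3600):
    th = 2 * math.pi * t / 3600
    w = (math.cos(th), math.sin(th))
    patterns.add(tuple(w[0] * x + w[1] * y > 0 for x, y in pts))

print(len(patterns))        # 8 realizable dichotomies out of 2^4 = 16
print(cover_count(4, 2))    # 8, matching the brute-force count
```

The VC entropy is the logarithm of such a count; here it is log 8 rather than the unrestricted log 16, and the gap between the two is exactly what controls the uniform-convergence bounds the abstract refers to.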