The geometric structure of the optimization landscape is argued to be fundamentally important to the success of deep learning. However, recent research efforts have focused either on toy random models with unrealistic assumptions or on numerical evidence about different shapes of the optimization landscape, and thus lack a unified view of the nature of the landscape. Here, we propose a statistical mechanics framework that directly builds a least-structured model of the high-dimensional weight space, taking into account realistic structured data, stochastic gradient descent algorithms, and the computational depth of the network parametrized by its weights. We also consider whether the number of network parameters exceeds the number of supplied training examples, namely, over- or under-parametrization. Our least-structured model predicts that the weight spaces of the under-parametrization and over-parametrization cases belong to the same class. These weight spaces are well-connected w

Keywords: deep learning
Date: 2020/11/19 09:52

