Gradient Descent Maximizes the Margin of Homogeneous Neural Networks
Sep 25, 2019, ICLR 2020 Conference Blind Submission
TL;DR: We study the implicit bias of gradient descent and prove under a minimal set of assumptions that the parameter direction of homogeneous models converges to KKT points of a natural margin maximization problem.
Abstract: In this paper, we study the implicit regularization of the gradient descent algorithm in homogeneous neural networks

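@hillbig When an NN is a homogeneous function (e.g., when using ReLU) and is trained with an exponential-type loss (e.g., cross-entropy loss), training by gradient descent corresponds to normalized margin maximization, and a convergence rate can also be shown. This is the first proof, under realistic conditions, that margin maximization takes place in nonlinear multilayer NNs.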
@hillbig They proved that when homogeneous NNs (e.g., ReLU activation) are trained with cross-entropy loss, the gradient descent algorithm implicitly maximizes the normalized margin. The first proof for non-linear NN under realistic assumptions.
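
To make "homogeneous" and "normalized margin" in the summaries above concrete, here is a minimal sketch in plain Python. It assumes a bias-free two-layer ReLU network, which is homogeneous of order 2 in its parameters, and uses the usual definition of the normalized margin as the smallest value of y * f(x) over the data, divided by the parameter norm raised to the homogeneity order. The toy data and all names (`forward`, `normalized_margin`, `scale`) are hypothetical and chosen only for illustration; none of this is code from the paper.

```python
import math
import random

HOMOGENEITY_ORDER = 2  # assumption: bias-free two-layer ReLU net, so f(c*theta) = c**2 * f(theta)

def relu(z):
    return max(0.0, z)

def forward(W, v, x):
    """Bias-free two-layer ReLU network: f(x) = sum_j v[j] * relu(W[j] . x)."""
    return sum(
        vj * relu(sum(wji * xi for wji, xi in zip(wj, x)))
        for wj, vj in zip(W, v)
    )

def param_norm(W, v):
    """Euclidean norm of all parameters taken together."""
    return math.sqrt(sum(w * w for row in W for w in row) + sum(vj * vj for vj in v))

def normalized_margin(W, v, data):
    """Smallest y * f(x) over the data, divided by ||theta|| ** HOMOGENEITY_ORDER."""
    margin = min(y * forward(W, v, x) for x, y in data)
    return margin / param_norm(W, v) ** HOMOGENEITY_ORDER

def scale(W, v, c):
    """Multiply every parameter by the positive constant c."""
    return [[c * w for w in row] for row in W], [c * vj for vj in v]

# Hypothetical toy setup: 4 hidden units, 2-dimensional inputs, labels in {+1, -1}.
random.seed(0)
W = [[random.gauss(0, 1) for _ in range(2)] for _ in range(4)]
v = [random.gauss(0, 1) for _ in range(4)]
data = [((1.0, 2.0), 1), ((-1.5, 0.5), -1), ((0.3, -2.0), -1)]

# Scaling all parameters by 3 multiplies every output by 3**2 = 9 ...
W3, v3 = scale(W, v, 3.0)
outputs = [forward(W, v, x) for x, _ in data]
outputs_scaled = [forward(W3, v3, x) for x, _ in data]

# ... while the normalized margin stays the same.
gamma = normalized_margin(W, v, data)
gamma_scaled = normalized_margin(W3, v3, data)
```

Because rescaling the parameters leaves the normalized margin unchanged, it depends only on the parameter direction, which is why the result is stated in terms of the direction the parameters converge to rather than their magnitude.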

