Gradient Descent Maximizes the Margin of Homogeneous Neural Networks | OpenReview

Sep 25, 2019 (ICLR 2020 Conference Blind Submission)
TL;DR: We study the implicit bias of gradient descent and prove, under a minimal set of assumptions, that the parameter direction of homogeneous models converges to KKT points of a natural margin maximization problem.
Abstract: In this paper, we study the implicit regularization of the gradient descent algorithm in homogeneous neural networks …
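
For concreteness, the "natural margin maximization problem" in the TL;DR can be sketched as follows (my notation, not quoted from the abstract: \Phi(\theta; x) is the network output, L its degree of homogeneity in the parameters \theta, and (x_i, y_i) with y_i \in \{\pm 1\} the training data):

% Normalized margin of an L-homogeneous network:
\[
  \bar{\gamma}(\theta) \;=\; \frac{\min_i \, y_i \, \Phi(\theta; x_i)}{\|\theta\|_2^{L}} .
\]
% Constrained problem whose KKT points the parameter direction is claimed to approach:
\[
  \min_{\theta} \ \tfrac{1}{2}\|\theta\|_2^{2}
  \quad \text{s.t.} \quad y_i \, \Phi(\theta; x_i) \ge 1 \ \text{ for all } i .
\]

Up to rescaling, maximizing \bar{\gamma} over parameter directions is equivalent to this constrained problem, which is why its KKT points serve as the notion of margin maximization here.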

2 mentions: @hillbig
Date: 2019/11/09 02:20

Referring Tweets

@hillbig When a NN is a homogeneous function (e.g., when ReLU is used) and is trained with an exponential-type loss (such as cross-entropy), training by gradient descent corresponds to normalized margin maximization, and a convergence rate can also be shown. This is the first proof that margin maximization holds for nonlinear multi-layer NNs under realistic conditions. t.co/GDSd1fc3uW
@hillbig They proved that when homogeneous NNs (e.g., ReLU activation) are trained with cross-entropy loss, the gradient descent algorithm implicitly maximizes the normalized margin. The first proof for non-linear NN under realistic assumptions. t.co/GDSd1fc3uW
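
To make the tweets' claim concrete, here is a minimal, hypothetical sketch in Python (not the authors' code): a bias-free two-layer ReLU network is 2-homogeneous in its parameters, and training it with logistic loss by plain gradient descent lets one track the normalized margin min_i y_i f(theta; x_i) / ||theta||_2^2. The toy data, architecture, and hyperparameters below are illustrative assumptions.

import torch

torch.manual_seed(0)

# Toy linearly separable data with labels in {-1, +1}.
X = torch.randn(64, 2)
y = torch.sign(X[:, 0] + 0.5 * X[:, 1])

# Bias-free two-layer ReLU net: f(c * theta; x) = c**2 * f(theta; x), i.e. 2-homogeneous.
W1 = (0.5 * torch.randn(16, 2)).requires_grad_()
w2 = (0.5 * torch.randn(16)).requires_grad_()

def f(x):
    return torch.relu(x @ W1.T) @ w2  # no biases, so homogeneity is preserved

lr = 0.1
for step in range(2001):
    # Logistic loss log(1 + exp(-y f(x))), an "exponential-type" loss in the tweets' sense.
    loss = torch.nn.functional.softplus(-y * f(X)).mean()
    loss.backward()
    with torch.no_grad():
        for p in (W1, w2):
            p -= lr * p.grad
            p.grad.zero_()
    if step % 500 == 0:
        with torch.no_grad():
            param_norm = (W1.norm() ** 2 + w2.norm() ** 2).sqrt()
            margin = (y * f(X)).min() / param_norm ** 2  # divide by ||theta||^L with L = 2
            print(f"step {step:5d}  loss {loss.item():.4f}  normalized margin {margin.item():.4f}")

In a run of this kind one would expect the printed normalized margin to increase once the data are fitted, which is the qualitative behavior that the paper's theorem (including the convergence rate mentioned above) formalizes.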

Related Entries

struct2depth
1 user, 2 mentions 2019/06/24 05:15
Truth or backpropaganda? An empirical investigation of deep learning theory | OpenReview
1 user, 2 mentions 2019/11/14 23:20
[1705.08790] The Lovász-Softmax loss: A tractable surrogate for the optimization of the intersection...
0 users, 2 mentions 2020/01/20 00:51
[2002.08056] The Geometry of Sign Gradient Descent
0 users, 3 mentions 2020/02/21 21:50
[1912.03207] NASA: Neural Articulated Shape Approximation
0 users, 3 mentions 2020/04/27 00:51