Gradient Descent Maximizes the Margin of Homogeneous Neural Networks | OpenReview

Sep 25, 2019 · ICLR 2020 Conference Blind Submission · readers: everyone

TL;DR: We study the implicit bias of gradient descent and prove, under a minimal set of assumptions, that the parameter direction of homogeneous models converges to KKT points of a natural margin maximization problem.

Abstract: In this paper, we study the implicit regularization of the gradient descent algorithm in homogeneous neural networks …

2 mentions: @hillbig
Date: 2019/11/09 02:20

Referring Tweets

@hillbig When an NN is a homogeneous function (e.g., when it uses ReLU) and is trained with an exponential-type loss (such as cross-entropy), training by gradient descent corresponds to normalized-margin maximization, and a convergence rate can also be shown. This is the first proof that margin maximization occurs in non-linear multi-layer NNs under realistic conditions.
@hillbig They proved that when homogeneous NNs (e.g., ReLU activation) are trained with cross-entropy loss, the gradient descent algorithm implicitly maximizes the normalized margin. The first proof for non-linear NNs under realistic assumptions.
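The quantity the tweets refer to, the normalized margin, is min_i y_i f(x_i) / ||θ||^L for an L-homogeneous network f with parameters θ. The paper's claim can be observed numerically by tracking it during training. Below is a minimal sketch, assuming a toy one-hidden-layer bias-free ReLU net (which is 2-homogeneous) on illustrative linearly separable data; none of the data, sizes, or step counts come from the paper:

```python
# Sketch only: track the normalized margin of a small 2-homogeneous ReLU
# network trained by gradient descent on the exponential loss.
# All data and hyperparameters here are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

# Toy binary classification data with labels in {-1, +1} (linearly separable).
X = rng.normal(size=(20, 2))
y = np.sign(X[:, 0] + 0.5 * X[:, 1])

# One-hidden-layer ReLU net without biases: f(x) = v . relu(W x).
# Scaling (W, v) by c scales f by c**2, so the network is L=2 homogeneous.
W = rng.normal(size=(8, 2)) * 0.1
v = rng.normal(size=8) * 0.1
L = 2

def forward(X, W, v):
    return np.maximum(X @ W.T, 0.0) @ v

def normalized_margin(X, y, W, v):
    theta_norm = np.sqrt(np.sum(W**2) + np.sum(v**2))
    return np.min(y * forward(X, W, v)) / theta_norm**L

lr = 0.01
margins = []
for step in range(5000):
    h = np.maximum(X @ W.T, 0.0)              # hidden activations
    f = h @ v
    # Exponential loss sum_i exp(-y_i f(x_i)); dLoss/df_i = -y_i exp(-y_i f_i)
    g = -y * np.exp(-y * f)
    grad_v = h.T @ g
    grad_W = (g[:, None] * (h > 0) * v[None, :]).T @ X
    v -= lr * grad_v
    W -= lr * grad_W
    margins.append(normalized_margin(X, y, W, v))

final_loss = np.exp(-y * forward(X, W, v)).sum()
```

In line with the theorem, once the net classifies all points correctly, `margins` tends upward as the parameter norm grows, even though the unnormalized loss keeps shrinking.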
