Understanding Why Neural Networks Generalize Well Through GSNR of Parameters | OpenReview

Understanding Why Neural Networks Generalize Well Through GSNR of Parameters
Sep 25, 2019 · Blind Submission · readers: everyone

Abstract: As deep neural networks (DNNs) achieve tremendous success across many application domains, researchers have explored many aspects of why they generalize well. In this paper, we provide a novel perspective on these issues using the gradient signal-to-noise ratio (GSNR) of parameters during the training process of DNNs. The GSNR of a parameter is defined as the ratio between its gradient's squared mean and variance over the data distribution.
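In symbols, the definition quoted above amounts to the following. The notation here (g_j for the per-sample gradient of the loss with respect to parameter θ_j, Z for the data distribution, r(θ_j) for the GSNR) is inferred from the abstract and the tweets below, not copied from the paper:

```latex
% GSNR of parameter \theta_j: squared mean of its per-sample gradient
% over the data distribution Z, divided by the variance of that gradient.
r(\theta_j) = \frac{\left(\mathbb{E}_{(x,y)\sim Z}\!\left[g_j(x,y)\right]\right)^{2}}
                   {\operatorname{Var}_{(x,y)\sim Z}\!\left[g_j(x,y)\right]}
```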

2 mentions: @hillbig
Date: 2020/05/21 00:52

Referring Tweets

@hillbig The per-sample squared mean of the gradient divided by its variance during training (GSNR) relates to generalization performance and can partly explain why NNs generalize well. GSNR is low at initialization and rises during training, so the network is trained in a way that generalizes. This suggests that abstract, data-independent patterns are acquired early in training. t.co/Pntqxch8qV
@hillbig GSNR (ratio between gradient's squared mean and variance) relates to the generalization gap and can partly explain why NN generalizes well. GSNR of NN increases in the early epochs because NN learns abstract patterns that generalize well. t.co/Pntqxch8qV
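Both tweets describe GSNR as the squared mean of the per-sample gradient divided by its variance. Below is a minimal NumPy sketch of that computation, assuming per-sample gradients have already been collected into an (n_samples, n_params) array; the simulated data is a stand-in for real backprop gradients, not the paper's setup:

```python
import numpy as np

# Minimal sketch: estimate per-parameter GSNR from per-sample gradients.
# grads has shape (n_samples, n_params): one gradient vector per example.
rng = np.random.default_rng(0)
n_samples, n_params = 256, 10

# Simulated per-sample gradients (stand-in for gradients from backprop).
grads = rng.normal(loc=0.05, scale=1.0, size=(n_samples, n_params))

mean = grads.mean(axis=0)        # E[g_j] over the sample, per parameter
var = grads.var(axis=0)          # Var[g_j] over the sample, per parameter
gsnr = mean**2 / (var + 1e-12)   # GSNR_j = E[g_j]^2 / Var[g_j]

print("per-parameter GSNR:", np.round(gsnr, 4))
```

In practice the per-sample gradients would come from backpropagating the loss on individual training examples; the 1e-12 term only guards against division by zero for parameters whose gradient variance vanishes.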
