[2102.06571] Bayesian Neural Network Priors Revisited

Isotropic Gaussian priors are the de facto standard for modern Bayesian neural network inference. However, such simplistic priors are unlikely to either accurately reflect our true beliefs about the weight distributions, or to give optimal performance. We study summary statistics of neural network weights in different networks trained using SGD. We find that fully connected networks (FCNNs) display heavy-tailed weight distributions, while convolutional neural network (CNN) weights display strong spatial correlations. Building these observations into the respective priors leads to improved performance on a variety of image classification datasets. Moreover, we find that these priors also mitigate the cold posterior effect in FCNNs, while in CNNs we see strong improvements at all temperatures, and hence no reduction in the cold posterior effect.
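The summary-statistics analysis the abstract describes can be reproduced in spirit with a few lines of Python. The sketch below is not the authors' code; the model, layer indices, and the Student-t fit are illustrative assumptions. It computes excess kurtosis and a Student-t fit for fully connected weights (a check for heavy tails) and the empirical correlation between spatial positions of convolutional filters (a check for spatial correlation), which is roughly the kind of evidence that motivates the heavy-tailed and correlated priors mentioned above.

```python
# Minimal sketch (not the paper's code): summary statistics of trained weights.
# Assumes a PyTorch model; the architecture and layer indices here are hypothetical.
import numpy as np
import torch
import torch.nn as nn
from scipy import stats


def tail_statistics(weights: np.ndarray) -> dict:
    """Statistics that distinguish heavy-tailed from Gaussian weight distributions."""
    # Excess kurtosis is ~0 for a Gaussian; large positive values indicate heavy tails.
    kurt = stats.kurtosis(weights)
    # Fit a Student-t; a small estimated degrees-of-freedom also signals heavy tails.
    df, loc, scale = stats.t.fit(weights)
    return {"excess_kurtosis": float(kurt), "student_t_df": float(df), "scale": float(scale)}


def filter_spatial_correlation(conv_weight: torch.Tensor) -> np.ndarray:
    """Empirical correlation between spatial positions within conv filters.

    conv_weight has shape (out_channels, in_channels, k, k). Each (out, in) slice
    is flattened over its k*k spatial positions and treated as one sample, giving
    a (k*k, k*k) correlation matrix; large off-diagonal entries indicate spatially
    correlated weights.
    """
    w = conv_weight.detach().cpu().numpy()
    out_c, in_c, k, _ = w.shape
    samples = w.reshape(out_c * in_c, k * k)   # one row per filter slice
    return np.corrcoef(samples, rowvar=False)  # (k*k, k*k) correlation matrix


if __name__ == "__main__":
    # Hypothetical stand-in for an SGD-trained network studied in the paper.
    model = nn.Sequential(
        nn.Conv2d(3, 16, kernel_size=3, padding=1),
        nn.ReLU(),
        nn.Flatten(),
        nn.Linear(16 * 32 * 32, 10),
    )
    fc_weights = model[3].weight.detach().cpu().numpy().ravel()
    print(tail_statistics(fc_weights))
    print(filter_spatial_correlation(model[0].weight).round(2))
```

In the paper's setting these statistics are computed on networks trained with SGD; on a freshly initialized model like the one above they simply recover the initialization distribution, so the example only shows the mechanics of the check.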

5 mentions: @vincefort, @Montreal_AI, @saqibali_ca
Date: 2021/02/21 03:51

Referring Tweets

@vincefort Have you ever wondered whether isotropic Gaussian priors are good enough for your Bayesian neural network weights? They are often used in practice, but we find in our new paper (t.co/EoDp5J7FA2) that they are indeed suboptimal! Details in thread. 1/13 t.co/8goOoenu7s

Related Entries

[2007.01179] Relating by Contrasting: A Data-efficient Framework for Multimodal Generative Models...
0 users, 3 mentions 2020/07/04 14:22
Adversarial Score Matching and Consistent Sampling – Alexia Jolicoeur-Martineau
0 users, 6 mentions 2020/09/14 00:30
[2004.04795] Exemplar VAE: Linking Generative Models, Nearest Neighbor Retrieval, and Data Augmentation...
0 users, 3 mentions 2020/12/10 15:51
[2012.15856] Studying Strategically: Learning to Mask for Closed-book QA
0 users, 3 mentions 2021/01/04 03:51
[2101.07367] Training Learned Optimizers with Randomly Initialized Learned Optimizers
0 users, 4 mentions 2021/01/24 06:52