[1712.00409] Deep Learning Scaling is Predictable, Empirically

Deep learning (DL) creates impactful advances following a virtuous recipe: model architecture search, creating large training data sets, and scaling computation. It is widely believed that growing training sets and models should improve accuracy and result in better products. As DL application domains grow, we would like a deeper understanding of the relationships between training set size, computational scale, and model accuracy improvements to advance the state-of-the-art. This paper presents a large scale empirical characterization of generalization error and model size growth as training sets grow. We introduce a methodology for this measurement and test four machine learning domains: machine translation, language modeling, image processing, and speech recognition. Our empirical results show power-law generalization error scaling across a breadth of factors, resulting in power-law exponents---the "steepness" of the learning curve---yet to be explained by theoretical work. Further, model improvements only shift the error but do not appear to affect the power-law exponent. We also show that model size scales sublinearly with data size. These scaling relationships have significant implications on deep learning research, practice, and systems. They can assist model debugging, setting accuracy targets, and decisions about data set growth. They can also guide computing system design and underscore the importance of continued computational scaling.
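To make the abstract's central claim concrete: a power-law learning curve means validation error falls roughly as a power of training set size, which appears as a straight line in log-log space. The sketch below fits such a curve by least squares on the logs; the training set sizes, error values, and variable names (`train_sizes`, `val_errors`, `alpha`, `beta`) are illustrative assumptions, not data or notation taken from the paper.

```python
# Minimal sketch of fitting the power-law learning curve described in the
# abstract: error ~ alpha * m**beta, where m is the training set size and
# beta (typically negative) is the "steepness" of the curve.
# The measurements below are made-up placeholders, not results from the paper.
import numpy as np

# Hypothetical (training set size, validation error) measurements.
train_sizes = np.array([1e4, 3e4, 1e5, 3e5, 1e6, 3e6])
val_errors = np.array([0.42, 0.33, 0.25, 0.20, 0.15, 0.12])

# A power law error = alpha * m**beta is linear in log-log space:
# log(error) = log(alpha) + beta * log(m), so an ordinary least-squares fit
# on the logs recovers beta (slope) and alpha (exp of the intercept).
beta, log_alpha = np.polyfit(np.log(train_sizes), np.log(val_errors), deg=1)
alpha = np.exp(log_alpha)

print(f"fitted power law: error ~ {alpha:.3f} * m^({beta:.3f})")

# Extrapolate to a larger (hypothetical) data set to estimate returns to data.
m_new = 1e7
print(f"predicted error at m = {m_new:.0e}: {alpha * m_new**beta:.3f}")
```

Fitting in log-log space is also why the referring tweet below speaks of "log-linear returns": each constant factor of additional data buys a roughly constant fractional reduction in error, with the exponent differing by domain.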

1 mention: @Miles_Brundage
Keywords: deep learning
Date: 2019/06/09 23:15

Referring Tweets

@Miles_Brundage First, consider "Deep Learning Scaling is Predictable, Empirically" by Hestness et al. at Baidu - t.co/wWim0MVP2P Fantastic paper that shows clear empirical tendencies distinguishing different ML domains w.r.t. returns to data, with a common theme of log-linear returns.
