NVIDIA Clocks World’s Fastest BERT Training Time and Largest Transformer Based Model, Paving Path For Advanced Conversational AI | NVIDIA Developer Blog

NVIDIA DGX SuperPOD trains BERT-Large in just 53 minutes and trains GPT-2 8B, the largest Transformer network ever, with 8.3 billion parameters.

Conversational AI is an essential building block of human interactions with intelligent machines and applications – from robots and cars, to home assistants and mobile apps. Getting computers to understand human languages, with all their …
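The 8.3 billion parameter figure in the excerpt can be sanity-checked with a quick back-of-the-envelope calculation. The sketch below assumes a GPT-2 8B-like configuration (72 layers, hidden size 3072, roughly 50k-token vocabulary, 1024-token context); these numbers and the helper function are illustrative assumptions, not details quoted on this page.

# A rough check of the 8.3 billion parameter figure.
# The configuration (72 layers, hidden size 3072, ~50k-token vocabulary,
# 1024-token context) is an assumed GPT-2 8B-like setup.

def transformer_params(layers, hidden, vocab, seq_len):
    """Rough parameter count for a decoder-only Transformer, ignoring biases."""
    per_layer = 12 * hidden ** 2              # attention (~4*h^2) + MLP (~8*h^2)
    embeddings = (vocab + seq_len) * hidden   # token + learned position embeddings
    return layers * per_layer + embeddings

print(f"{transformer_params(72, 3072, 50257, 1024) / 1e9:.2f}B")  # ~8.31B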

23 mentions: @ctnzr, @jaguring1, @Tdys13, @AlisonBLowndes, @_AlexEne_, @Seanku, @okajimania, @rnalytics
Date: 2019/08/13 13:00

Referring Tweets

@jaguring1 Pre-training BERT-Large used to take three days; NVIDIA has cut it to 53 minutes, a world record. They also built GPT-2 8B with 8.3 billion parameters (the largest to date: 24x the size of BERT and 5.6x the size of GPT-2). The code is open source. t.co/jxnd4O0cJN t.co/Pe72ZVCW6Z
@AlisonBLowndes More info on @nvidia's MegaTron harnessing TensorRT and 8-way model parallelism: t.co/YftCrn0zV0 #AI #NLP #ConversationalAI t.co/riqFHm57dG
@ctnzr Three scaling breakthroughs for NLP: fastest BERT-Large training (under one hour), fastest BERT inference (2.2ms on T4), and largest Transformer (GPT-2 8.3B). Code is open source. t.co/MUq7XzPQFe
@okajimania So this is the source for the EE Times article saying BERT-Large can be trained in under an hour. Note: this is about the NVIDIA DGX SuperPOD, a supercomputer built from 92 DGX-2 systems. t.co/DS5eNh6pPm
@tunguz The NVIDIA DGX SuperPOD with 92 DGX-2H nodes set a new record by training BERT-Large in just 53 minutes. This record was set using 1,472 V100 SXM3-32GB GPUs and 10 Mellanox Infiniband adapters per node. t.co/YQsT0xoWoC
@maxpagels All yours for a hardware cost north of $35 million and 662 kWh peak power draw. t.co/YSmCL5IPAh
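
The 8-way model parallelism mentioned in the tweets above splits individual weight matrices across GPUs rather than assigning whole layers to different devices. Below is a minimal NumPy simulation of that idea for one column-parallel MLP matmul; the layer sizes are assumed GPT-2 8B-like values, and the code illustrates the general technique, not Megatron-LM's actual PyTorch/NCCL implementation.

import numpy as np

hidden, ffn, n_workers = 3072, 4 * 3072, 8    # assumed GPT-2 8B-like layer sizes

rng = np.random.default_rng(0)
x = rng.standard_normal((4, hidden)).astype(np.float32)           # tiny batch of activations
w = rng.standard_normal((hidden, ffn)).astype(np.float32) * 0.02  # full MLP weight matrix

# Column-parallel split: each simulated "GPU" owns ffn / n_workers output columns.
shards = np.split(w, n_workers, axis=1)
partials = [x @ shard for shard in shards]        # each worker's local matmul
y_parallel = np.concatenate(partials, axis=1)     # gather the column blocks

# The partitioned computation matches the single-device result.
assert np.allclose(y_parallel, x @ w, atol=1e-3)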

Related Entries

GitHub - soskek/bert-chainer: Chainer implementation of "BERT: Pre-training of Deep Bidirectional Tr...
7 users, 0 mentions 2018/12/02 18:01
[DL Hacks]BERT: Pre-training of Deep Bidirectional Transformers for L…
4 users, 5 mentions 2018/12/07 04:31
[DL Reading Group] BERT: Pre-training of Deep Bidirectional Transformers for Lang…
0 users, 0 mentions 2018/10/20 12:15
GitHub - huggingface/pytorch-pretrained-BERT: The Big-&-Extending-Repository-of-Transformers: PyTorc...
1 user, 7 mentions 2019/03/04 21:47
Fine-tuning BERT for document classification with huggingface transformers on TensorFlow 2 - メモ帳
0 users, 1 mentions 2019/10/22 12:50