NVIDIA Clocks World’s Fastest BERT Training Time and Largest Transformer Based Model, Paving Path For Advanced Conversational AI | NVIDIA Developer Blog

NVIDIA DGX SuperPOD trains BERT-Large in just 53 minutes and trains GPT-2 8B, the largest Transformer network ever, with 8.3Bn parameters.

Conversational AI is an essential building block of human interactions with intelligent machines and applications – from robots and cars to home assistants and mobile apps. Getting computers to understand human languages, with all their …

12 mentions: @ctnzr, @jaguring1, @Tdys13, @AlisonBLowndes, @Seanku, @rnalytics, @tunguz, @Mangapcom
Date: 2019/08/13 13:00

Referring Tweets

@ctnzr Three scaling breakthroughs for NLP: fastest BERT-Large training (under one hour), fastest BERT inference (2.2ms on T4), and largest Transformer (GPT-2 8.3B). Code is open source. https://t.co/MUq7XzPQFe
@jaguring1 BERT-Large pre-training used to take three days; NVIDIA has cut it to 53 minutes – a world record. They also built GPT-2 8B with 8.3 billion parameters (the largest model to date: 24x the size of BERT and 5.6x the size of GPT-2). The code is open source. https://t.co/jxnd4O0cJN https://t.co/Pe72ZVCW6Z
@Tdys13 BERT training finished in 53 minutes (on 1,472 V100 GPUs). NVIDIA Clocks World’s Fastest BERT Training Time and Largest Transformer Based Model, Paving Path For Advanced Conversational AI | NVIDIA Developer Blog https://t.co/yVUwK3pzLH https://t.co/gdghMIB3hT
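
The size ratios quoted in @jaguring1's tweet can be sanity-checked with two divisions. A minimal sketch in Python, assuming the commonly cited parameter counts of roughly 340M for BERT-Large and 1.5B for GPT-2 (neither figure is stated on this page):

    # Sanity check of the model-size ratios quoted in the tweets above.
    # Assumed parameter counts (not stated on this page; from the original papers):
    bert_large = 340e6   # BERT-Large: ~340M parameters
    gpt2 = 1.5e9         # GPT-2: ~1.5B parameters
    gpt2_8b = 8.3e9      # NVIDIA's GPT-2 8B: 8.3B parameters

    print(f"GPT-2 8B / BERT-Large: {gpt2_8b / bert_large:.1f}x")  # ~24.4x
    print(f"GPT-2 8B / GPT-2:      {gpt2_8b / gpt2:.1f}x")        # ~5.5x

The results (~24.4x and ~5.5x) line up with the "24x BERT, 5.6x GPT-2" figures in the tweet, up to rounding.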

Related Entries

GitHub - soskek/bert-chainer: Chainer implementation of "BERT: Pre-training of Deep Bidirectional Tr...
[DL Hacks] BERT: Pre-training of Deep Bidirectional Transformers for L…
The Annotated Transformer
[DL Reading Group] BERT: Pre-training of Deep Bidirectional Transformers for Lang…
GitHub - huggingface/pytorch-pretrained-BERT: The Big-&-Extending-Repository-of-Transformers: PyTorc...