Nvidia trains world’s largest Transformer-based language model | VentureBeat

Nvidia said it has trained the world's largest Transformer-based language model and achieved the fastest training and inference times for Google's popular BERT model.

20 mentions: @SanhEstPasMoi, @stanfordnlp, @neptanum, @EPro, @taniajacob, @IntelaSolutions, @omarsar0, @AIPerspective
Date: 2019/08/13 13:00

Referring Tweets

@SanhEstPasMoi "The model uses 8.3 billion parameters and is 24 times larger than BERT and 5 times larger than OpenAI’s GPT-2" Why am I not surprised? For such a computational effort, I hope the weights will be released publicly... https://t.co/0qZkEllRqQ
@stanfordnlp “Nvidia was able to train BERT-Large using optimized PyTorch software and a DGX-SuperPOD of more than 1,000 GPUs that is able to train BERT in 53 minutes.” – @kharijohnson, @VentureBeat https://t.co/9gT3aZTsBs
@neptanum Nvidia Trains World’s Largest Language Model on 1000 GPUs in 53 Minutes @VentureBeat. #Analytics #DataScience #AI #MachineLearning #NLProc #IoT #IIoT #PyTorch #Python #RStats #JavaScript #ReactJS #GoLang #CloudComputing #Serverless #Linux @gp_pulipaka https://t.co/PoiG7nos3n https://t.co/wExq6kz030
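As a rough sanity check of the size comparison quoted in the first tweet, the figures below use the commonly cited parameter counts for BERT-Large (~340 million) and the full GPT-2 (~1.5 billion); those reference counts are assumptions based on public figures, not numbers taken from this page.

# Sanity check of the quoted size comparison (sketch; BERT-Large and GPT-2
# parameter counts are commonly cited public figures, not from this article).
megatron_params = 8.3e9    # Nvidia's model, per the article
bert_large_params = 340e6  # BERT-Large (assumed reference figure)
gpt2_params = 1.5e9        # full GPT-2 (assumed reference figure)

print(f"vs. BERT-Large: {megatron_params / bert_large_params:.1f}x")  # ~24.4x
print(f"vs. GPT-2:      {megatron_params / gpt2_params:.1f}x")        # ~5.5x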

Related Entries

GitHub - soskek/bert-chainer: Chainer implementation of "BERT: Pre-training of Deep Bidirectional Tr...
[DL Hacks] BERT: Pre-training of Deep Bidirectional Transformers for L…
The Annotated Transformer
[DL Reading Group] BERT: Pre-training of Deep Bidirectional Transformers for Lang…
GitHub - huggingface/pytorch-pretrained-BERT: The Big-&-Extending-Repository-of-Transformers: PyTorc...
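The huggingface repository listed above (since renamed to "transformers") can be used to run a pretrained BERT of the kind the article benchmarks. The following is a minimal inference sketch assuming the pytorch-pretrained-BERT package and the public bert-base-uncased checkpoint; both are illustrative choices, not anything specified in the article.

import torch
from pytorch_pretrained_bert import BertTokenizer, BertModel  # package from the repo above

# Load a public pretrained checkpoint (illustrative; not the model Nvidia trained).
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertModel.from_pretrained('bert-base-uncased')
model.eval()

text = "[CLS] Nvidia trained a very large language model . [SEP]"
token_ids = tokenizer.convert_tokens_to_ids(tokenizer.tokenize(text))
tokens_tensor = torch.tensor([token_ids])

with torch.no_grad():
    # Returns the hidden states of every encoder layer plus a pooled [CLS] vector.
    encoded_layers, pooled_output = model(tokens_tensor)

print(len(encoded_layers), encoded_layers[-1].shape)  # 12 layers of [1, seq_len, 768] for bert-base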