Nvidia trains world’s largest Transformer-based language model | VentureBeat

Nvidia said it has trained the world's largest Transformer-based language model and achieved the fastest training and inference for Google's popular BERT model.

51 mentions: @SanhEstPasMoi, @stanfordnlp, @neptanum, @EPro, @taniajacob, @IntelaSolutions, @omarsar0, @AIPerspective
Date: 2019/08/13 13:00

Referring Tweets

@stanfordnlp “Nvidia was able to train BERT-Large using optimized PyTorch software and a DGX-SuperPOD of more than 1,000 GPUs that is able to train BERT in 53 minutes.” – @kharijohnson, @VentureBeat t.co/9gT3aZTsBs
@SanhEstPasMoi "The model uses 8.3 billion parameters and is 24 times larger than BERT and 5 times larger than OpenAI’s GPT-2" Why am I not surprised? For such a computational effort, I hope the weights will be released publicly... t.co/0qZkEllRqQ
@EPro The model uses 8.3 billion parameters and is 24 times larger than BERT, says @nvidia t.co/RQ8P2omjOV
@omarsar0 Not exactly how you would typically end an article but this reflects the state of things in our field. When I read the sentences it feels like it is just a race. From a researcher's perspective, these words spark the emotion of fear... not good! t.co/kuVtYhOuSG t.co/rdfIHmSU1X
@AIPerspective Nvidia today announced that it has trained the world’s largest language model, just the latest in a series of updates the GPU maker has aimed at advancing conversational AI t.co/1QoiXjfb3Q
@neptanum Nvidia Trains World’s Largest Language Model on 1000 GPUs in 53 Minutes on DGX SuperPod #BigData #Analytics #DataScience #AI #MachineLearning #NLProc #IoT #Python #RStats #JavaScript #ReactJS #CloudComputing #Serverless #DataScientist #Linux t.co/PoiG7nos3n @gp_pulipaka t.co/L9MNnKXatl
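The size comparisons quoted in the tweets above can be sanity-checked with quick arithmetic. A minimal sketch, assuming the commonly cited parameter counts (Megatron-LM at 8.3 billion, BERT-Large at roughly 345 million, and GPT-2 at 1.5 billion; the article's "BERT" is taken to mean BERT-Large):

```python
# Rough parameter-count arithmetic behind the quoted size comparisons.
# Assumed figures (not stated in full in this page):
#   Megatron-LM: 8.3e9 parameters
#   BERT-Large:  345e6 parameters
#   GPT-2:       1.5e9 parameters
megatron = 8.3e9
bert_large = 345e6
gpt2 = 1.5e9

ratio_bert = megatron / bert_large  # ~24.06, consistent with "24 times larger than BERT"
ratio_gpt2 = megatron / gpt2        # ~5.53, consistent with "5 times larger than GPT-2"

print(f"vs BERT-Large: {ratio_bert:.1f}x")
print(f"vs GPT-2:      {ratio_gpt2:.1f}x")
```

The "24 times" figure only holds against BERT-Large's ~345M parameters, not the smaller BERT-Base, which is why the assumed baseline matters here.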

Related Entries

Read more GitHub - soskek/bert-chainer: Chainer implementation of "BERT: Pre-training of Deep Bidirectional Tr...
7 users, 0 mentions 2018/12/02 18:01
Read more [DL Hacks]BERT: Pre-training of Deep Bidirectional Transformers for L…
4 users, 5 mentions 2018/12/07 04:31
Read more The Annotated Transformer
0 users, 0 mentions 2018/08/27 01:24
Read more [DL Paper Reading Group] BERT: Pre-training of Deep Bidirectional Transformers for Lang…
0 users, 0 mentions 2018/10/20 12:15
Read more GitHub - huggingface/pytorch-pretrained-BERT: The Big-&-Extending-Repository-of-Transformers: PyTorc...
1 user, 7 mentions 2019/03/04 21:47