[2009.06732] Efficient Transformers: A Survey

Transformer model architectures have garnered immense interest lately due to their effectiveness across a range of domains like language, vision and reinforcement learning. In the field of natural language processing, for example, Transformers have become an indispensable staple in the modern deep learning stack. Recently, a dizzying number of "X-former" models have been proposed (Reformer, Linformer, Performer, Longformer, to name a few) that improve upon the original Transformer architecture, many of them targeting computational and memory efficiency. With the aim of helping the avid researcher navigate this flurry, this paper characterizes a large and thoughtful selection of recent efficiency-flavored "X-former" models, providing an organized and comprehensive overview of existing work and models across multiple domains.
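
The efficiency bottleneck these "X-former" models attack is the n x n attention matrix of standard self-attention, which grows quadratically with sequence length. As a rough illustration (not taken from the survey itself), the sketch below contrasts full scaled dot-product attention with a Linformer-style variant that projects keys and values down to a fixed length k << n; the projection matrices E and F here are random stand-ins for what would normally be learned parameters.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def full_attention(Q, K, V):
    # Standard scaled dot-product attention: the (n, n) score matrix
    # is the source of the quadratic time and memory cost.
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)        # (n, n)
    return softmax(scores) @ V           # (n, d)

def linformer_style_attention(Q, K, V, E, F):
    # Low-rank attention in the spirit of Linformer: project keys and
    # values along the sequence dimension to length k, so the score
    # matrix is only (n, k). E and F are random here for illustration.
    d = Q.shape[-1]
    K_proj = E @ K                       # (k, d)
    V_proj = F @ V                       # (k, d)
    scores = Q @ K_proj.T / np.sqrt(d)   # (n, k)
    return softmax(scores) @ V_proj      # (n, d)

n, d, k = 1024, 64, 128
rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((n, d)) for _ in range(3))
E, F = (rng.standard_normal((k, n)) / np.sqrt(n) for _ in range(2))

out_full = full_attention(Q, K, V)
out_low_rank = linformer_style_attention(Q, K, V, E, F)
print(out_full.shape, out_low_rank.shape)  # both (1024, 64)
```

The survey covers many other routes to the same goal (sparse or local attention patterns, hashing, kernel approximations, recurrence); the low-rank projection above is only one representative example.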

15 mentions: @hardmaru, @jaguring1, @omarsar0, @ytay017, @cosminnegruseri, @KevinKaichuang, @ComputingByArts, @ZachBessinger
Date: 2020/09/16 05:21

Referring Tweets

@omarsar0 This year we have been seeing a variety of works introducing techniques for improving Transformers and making them more computationally and memory efficient. This recent survey paper highlights some notable ideas like Reformer and Longformer, among others. t.co/VRcLGPaRmQ t.co/mfe4cwAzYl
@ZachBessinger Glad to see a survey paper put together on many of the recent advances in #Transformers! Looking forward to reading this. t.co/iEC7LHd5i7
@hardmaru Finally, a paper that summarizes the recent improvements that make the Transformer much more efficient! t.co/FBHs24xzRw t.co/4Ayqh3NbVp t.co/2kwW6O7uIQ
@jaguring1 Google researchers have put together a summary of techniques for improving the Transformer, which has been producing results across diverse fields such as natural language processing, computer vision, and reinforcement learning. Efficient Transformers: A Survey t.co/tYjk0TFFx7 t.co/o9qVpmC41F
@cosminnegruseri Recipe for transformer papers: 1) think of a cool math/algorithmic trick, 2) combine it with self-attention, 3) profit? See the recent survey paper: t.co/dobR61DrJj t.co/J5l0T9UOuR
@ComputingByArts @anirbanghosh357 @nickcammarata > Did I hear OpenAI become closed AI? It's just one organization. They are great (and not all that closed), but the overall flow of progress in AI is now overwhelming: t.co/ou7sUazDJx t.co/vYa8sWnfhg t.co/9gM6NKtBak
@ytay017 Inspired by the dizzying number of efficient Transformer ("X-former") models that have been coming out lately, we wrote a survey paper to organize all this information. Check it out at t.co/nAaTLG8wOp. Joint work with @m__dehghani @dara_bahri and @metzlerd. @GoogleAI 😀😃 t.co/0M7a0oCqdj
@KevinKaichuang Nice survey of efficient transformer variants, in case anybody else finds this zoo as confusing as I do. @ytay017 @m__dehghani @dara_bahri @metzlerd t.co/Ir4pDwMU4b t.co/DIYojsOuHX
@ComputingByArts A survey covering all kinds of recent work on Efficient Transformers (by Google Research): t.co/ou7sUazDJx t.co/ktN4xyTKFR
