[1804.09849] The Best of Both Worlds: Combining Recent Advances in Neural Machine Translation

The past year has witnessed rapid advances in sequence-to-sequence (seq2seq) modeling for Machine Translation (MT). The classic RNN-based approaches to MT were first outperformed by the convolutional seq2seq model, which was then outperformed by the more recent Transformer model. Each of these new approaches consists of a fundamental architecture accompanied by a set of modeling and training techniques that are in principle applicable to other seq2seq architectures. In this paper, we tease apart the new architectures and their accompanying techniques in two ways. First, we identify several key modeling and training techniques, and apply them to the RNN architecture, yielding a new RNMT+ model that outperforms all three of the fundamental architectures on the benchmark WMT'14 English to French and English to German tasks. Second, we analyze the properties of each fundamental seq2seq architecture and devise new hybrid architectures intended to combine their strengths. Our hybrid models

1 mention: @_arohan_
Keywords: machine translation

Referring Tweets

@_arohan_ @srchvrs @hardmaru Yeah story as old as time itself, Linear + Deep t.co/45J6bDIPbF Transformer Encoder + LSTM decoder t.co/qTn53sBwXl Convolutions + ViT t.co/o7FzQY35od Transformer with convs t.co/Xh0IazlNhp

Related Entries

Towards Practical Second Order Optimization for Deep Learning | OpenReview
0 users, 1 mention 2021/01/15 18:54
google-research/matrix_functions.py at master · google-research/google-research · GitHub
0 users, 1 mention 2021/09/10 18:09
Differentiable Self-Adaptive Learning Rate | OpenReview
0 users, 1 mention 2021/11/11 06:09
[1606.07792] Wide & Deep Learning for Recommender Systems
10 users, 1 mention 2021/12/08 01:26
NeurIPS 2021
0 users, 1 mention 2021/12/10 04:37
