[2102.11090] Position Information in Transformers: An Overview

Transformers are arguably the main workhorse in recent Natural Language Processing research. By definition, a Transformer is invariant with respect to reorderings of the input. However, language is inherently sequential, and word order is essential to the semantics and syntax of an utterance. In this paper, we provide an overview of common methods to incorporate position information into Transformer models. The objectives of this survey are to i) showcase that position information in Transformers is a vibrant and extensive research area; ii) enable the reader to compare existing methods by providing a unified notation and meaningful clustering; iii) indicate which characteristics of an application should be taken into account when selecting a position encoding; iv) provide stimuli for future research.
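To make the notion of a position model concrete, here is a minimal sketch (an assumed example, not taken from the survey) of the sinusoidal absolute position encoding of Vaswani et al. (2017), the baseline that most of the surveyed methods build on; the function name and array shapes are illustrative.

import numpy as np

def sinusoidal_position_encoding(seq_len: int, d_model: int) -> np.ndarray:
    """Return a (seq_len, d_model) matrix of fixed sinusoidal position encodings."""
    positions = np.arange(seq_len)[:, None]                  # (seq_len, 1)
    dims = np.arange(0, d_model, 2)[None, :]                 # (1, d_model/2)
    angles = positions / np.power(10000.0, dims / d_model)   # (seq_len, d_model/2)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)  # even dimensions
    pe[:, 1::2] = np.cos(angles)  # odd dimensions
    return pe

# Usage: add the encoding to (hypothetical) token embeddings before the first layer.
token_embeddings = np.random.randn(10, 64)                   # (seq_len, d_model)
model_input = token_embeddings + sinusoidal_position_encoding(10, 64)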

4 mentions: @mnschmit, @arxivabs
Keywords: transformer
Date: 2021/02/23 23:21

Referring Tweets

@mnschmit Have you ever thought about the fact that a Transformer w/o a position model sees language as a bag of words? We (i.e., @PDufter, @HinrichSchuetze, and myself) just finished the first version of a survey on different position models for Transformers. t.co/UQUfYFHNqj 1/3
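The "bag of words" point in the tweet can be checked directly: plain self-attention without any position information is permutation-equivariant, so shuffling the input tokens merely shuffles the outputs. The toy example below (illustrative only, using identity query/key/value projections; not code from the authors) makes this explicit.

import numpy as np

def self_attention(x: np.ndarray) -> np.ndarray:
    """Single-head self-attention with identity projections, for illustration."""
    scores = x @ x.T / np.sqrt(x.shape[-1])
    weights = np.exp(scores) / np.exp(scores).sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ x

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 8))    # 5 "tokens" of dimension 8, no position information added
perm = rng.permutation(5)

# Permuting the input rows permutes the output rows in exactly the same way,
# so without a position model word order carries no signal.
assert np.allclose(self_attention(x)[perm], self_attention(x[perm]))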

Related Entries

Read more [2007.01282] Leveraging Passage Retrieval with Generative Models for Open Domain Question Answering
0 users, 3 mentions 2020/07/03 02:21
Read more Adversarial Score Matching and Consistent Sampling – Alexia Jolicoeur-Martineau
0 users, 6 mentions 2020/09/14 00:30
Read more [2009.12756] Answering Complex Open-Domain Questions with Multi-Hop Dense Retrieval
0 users, 3 mentions 2020/09/29 18:53
Read more [2101.07367] Training Learned Optimizers with Randomly Initialized Learned Optimizers
0 users, 4 mentions 2021/01/24 06:52
Read more [2102.06356] A Large Batch Optimizer Reality Check: Traditional, Generic Optimizers Suffice Across B...
0 users, 3 mentions 2021/02/15 02:21