DEBERTA: DECODING-ENHANCED BERT WITH DISENTANGLED ATTENTION | OpenReview
Recent progress in pre-trained neural language models has significantly improved the performance of many natural language processing (NLP) tasks. In this paper we propose a new model architecture...
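
The clipped abstract cuts off before describing the mechanism named in the title. As a reference sketch, reconstructed from the published DeBERTa paper rather than from this clip, the disentangled attention score between tokens $i$ and $j$ sums a content-to-content, a content-to-position, and a position-to-content term:

\tilde{A}_{i,j} = Q_i^c \left(K_j^c\right)^\top + Q_i^c \left(K_{\delta(i,j)}^r\right)^\top + K_j^c \left(Q_{\delta(j,i)}^r\right)^\top, \qquad H_o = \mathrm{softmax}\!\left(\tilde{A}/\sqrt{3d}\right) V^c

Here $Q^c, K^c, V^c$ are projections of the content (hidden-state) vectors, $Q^r, K^r$ are projections of relative position embeddings, $\delta(i,j)$ is the clipped relative distance from token $i$ to token $j$, and $d$ is the head dimension; the $\sqrt{3d}$ scaling reflects the three summed terms.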

Keywords: attention
Date: 2021/02/22 17:22