[2005.04611] How Context Affects Language Models' Factual Predictions

When pre-trained on large unsupervised textual corpora, language models are able to store and retrieve factual knowledge to some extent, making it possible to use them directly for zero-shot cloze-style question answering. However, storing factual knowledge in a fixed number of weights of a language model clearly has limitations. Previous approaches have successfully provided access to information outside the model weights using supervised architectures that combine an information retrieval system with a machine reading component. In this paper, we go a step further and integrate information from a retrieval system with a pre-trained language model in a purely unsupervised way. We report that augmenting pre-trained language models in this way dramatically improves performance and that the resulting system, despite being unsupervised, is competitive with a supervised machine reading baseline. Furthermore, processing query and context with different segment tokens allows BERT to utilize
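
A minimal sketch of how a cloze query and a retrieved context could be packed into one BERT-style input with different segment ids. The helper name `build_cloze_input`, the whitespace tokenization, the query-first ordering, and the example sentences are illustrative assumptions, not details taken from the paper.

```python
from typing import List, Tuple

# Special tokens in the BERT convention.
CLS, SEP, MASK = "[CLS]", "[SEP]", "[MASK]"


def build_cloze_input(query: str, context: str) -> Tuple[List[str], List[int]]:
    """Pair a cloze query with a context passage (hypothetical helper).

    Assumption: the query comes first and the context second; whitespace
    splitting stands in for a real WordPiece tokenizer.
    """
    query_tokens = query.split()
    context_tokens = context.split()

    tokens = [CLS] + query_tokens + [SEP] + context_tokens + [SEP]

    # Segment 0 covers [CLS], the query, and the first [SEP];
    # segment 1 covers the context and the final [SEP].
    segment_ids = [0] * (len(query_tokens) + 2) + [1] * (len(context_tokens) + 1)
    return tokens, segment_ids


tokens, segment_ids = build_cloze_input(
    "Dante was born in [MASK] .",
    "Dante Alighieri was born in Florence .",
)
```

In a real setup the context would come from a retrieval system and the token and segment ids would be fed to a pre-trained masked language model, which then scores candidates for the `[MASK]` position without any fine-tuning.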

3 mentions: @icoxfog417 @kurama554101
Date: 2020/06/27 02:21

Referring Tweets

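@icoxfog417 A study that solves QA using only a pre-trained model, with no fine-tuning. It confirms that adding a context c to the question q (separated by SEP/EOS) yields performance on par with a supervised baseline (DrQA). The context c can be retrieved with IR (even TF-IDF helps), generated from q, and so on. It also confirms that BERT's often-neglected next sentence prediction is effective here. t.co/lBoFdmpjt9
@kurama554101 t.co/VQMEvTp12K Apparently this solves QA without fine-tuning. I have not read it yet, but it seems to use Next Sentence Prediction. Other BERT-based papers offer many MLM variants, but my impression is that Next Sentence Prediction is rarely used, so I am interested.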

Related Entries

[1902.10547] An Embarrassingly Simple Approach for Transfer Learning from Pretrained Language Models
0 users, 4 mentions 2019/02/28 21:48
[1809.06963] Multi-task Learning with Sample Re-weighting for Machine Reading Comprehension
0 users, 1 mentions 2019/04/16 02:18
Mogrifier LSTM | OpenReview
0 users, 1 mentions 2020/01/19 05:20
Weekly Machine Learning #162 | Revue
0 users, 1 mentions 2020/02/08 03:50
[1907.10641] WinoGrande: An Adversarial Winograd Schema Challenge at Scale
0 users, 4 mentions 2020/02/10 15:01