When pre-trained on large unsupervised textual corpora, language models are able to store and retrieve factual knowledge to some extent, making it possible to use them directly for zero-shot cloze-style question answering. However, storing factual knowledge in a fixed number of weights of a language model clearly has limitations. Previous approaches have successfully provided access to information outside the model weights using supervised architectures that combine an information retrieval system with a machine reading component. In this paper, we go a step further and integrate information from a retrieval system with a pre-trained language model in a purely unsupervised way. We report that augmenting pre-trained language models in this way dramatically improves performance and that the resulting system, despite being unsupervised, is competitive with a supervised machine reading baseline. Furthermore, processing query and context with different segment tokens allows BERT to utilize
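
As a rough illustration of the input format the abstract alludes to, the following minimal sketch pairs a cloze-style query with a retrieved context and assigns a different segment id to each part. It is not the authors' code: the function name `build_cloze_input`, the whitespace tokenization, the query-then-context ordering, and the example sentences are all assumptions made for illustration. A real setup would use BERT's own WordPiece tokenizer and take the context from a retrieval system.

```python
from typing import List, Tuple

# Special tokens in the BERT convention.
CLS, SEP, MASK = "[CLS]", "[SEP]", "[MASK]"


def build_cloze_input(query: str, context: str) -> Tuple[List[str], List[int]]:
    """Build a BERT-style token sequence and segment ids (hypothetical helper).

    The query (containing a [MASK] slot) goes in segment 0 and the
    context goes in segment 1, separated by [SEP].
    Whitespace splitting stands in for a real subword tokenizer.
    """
    q_tokens = query.split()
    c_tokens = context.split()

    tokens = [CLS] + q_tokens + [SEP] + c_tokens + [SEP]
    # [CLS] + query + first [SEP] belong to segment 0;
    # context + final [SEP] belong to segment 1.
    segment_ids = [0] * (len(q_tokens) + 2) + [1] * (len(c_tokens) + 1)
    return tokens, segment_ids


# Illustrative example (not taken from the paper's data).
tokens, segment_ids = build_cloze_input(
    "Dante was born in [MASK] .",
    "Dante Alighieri was an Italian poet born in Florence .",
)
for tok, seg in zip(tokens, segment_ids):
    print(seg, tok)
```

In such a setup, the model's prediction at the `[MASK]` position would be read off as the answer; no fine-tuning step is involved.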

Date: 2020/06/27 02:21

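@icoxfog417 A study that solves QA using only a pre-trained model, with no fine-tuning. It confirms that appending a context c to the question q (separated by SEP/EOS) achieves performance on par with a supervised baseline (DrQA). (c is retrieved via IR (even TF-IDF is effective), generated from q, etc.) It also confirms that BERT's often-neglected next sentence prediction turns out to be useful. t.co/lBoFdmpjt9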
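@kurama554101 t.co/VQMEvTp12K Apparently this solves QA without fine-tuning. I'm about to read it, but it seems to use Next Sentence Prediction. Other BERT-based papers have many MLM variants, but I don't get the impression that Next Sentence Prediction is used much, so I'm curious.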

