[2102.09690] Calibrate Before Use: Improving Few-Shot Performance of Language Models

GPT-3 can perform numerous tasks when provided a natural language prompt that contains a few training examples. We show that this type of few-shot learning can be unstable: the choice of prompt format, training examples, and even the order of the training examples can cause accuracy to vary from near chance to near state-of-the-art. We demonstrate that this instability arises from the bias of language models towards predicting certain answers, e.g., those that are placed near the end of the prompt or are common in the pre-training data. To mitigate this, we first estimate the model's bias towards each answer by asking for its prediction when given the training prompt and a content-free test input such as "N/A". We then fit calibration parameters that cause the prediction for this input to be uniform across answers. On a diverse set of tasks, this contextual calibration procedure substantially improves GPT-3 and GPT-2's average accuracy (up to 30.0% absolute) and reduces variance across different choices of the prompt.
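The abstract sketches a two-step fix: query the model with a content-free input such as "N/A" to estimate its per-answer bias, then rescale predictions on real inputs so that the content-free prediction becomes uniform. Below is a minimal NumPy sketch of that idea; the function name, array shapes, and example numbers are our own illustration, not code from the paper.

```python
import numpy as np

def contextual_calibration(label_probs, content_free_probs):
    """Rescale label probabilities by the model's bias on a content-free
    input (e.g. "N/A") so the content-free prediction becomes uniform.

    label_probs:        P(label | prompt, real test input), shape (K,)
    content_free_probs: P(label | prompt, content-free input), shape (K,)
    """
    label_probs = np.asarray(label_probs, dtype=float)
    content_free_probs = np.asarray(content_free_probs, dtype=float)
    # W = diag(p_cf)^-1 with b = 0: divide each label's probability by the
    # probability the model assigns it when the input carries no content.
    calibrated = label_probs / content_free_probs
    return calibrated / calibrated.sum()  # renormalize to a distribution

# Hypothetical example: a sentiment prompt biased toward "positive".
p_test = np.array([0.70, 0.30])  # P(positive), P(negative) on a real input
p_cf = np.array([0.80, 0.20])    # same probabilities on the "N/A" input
print(contextual_calibration(p_test, p_cf))  # -> [0.368..., 0.631...]
```

In this toy example, the raw prediction favors "positive" only because the model's intrinsic bias does; after dividing out the bias measured on "N/A", the calibrated prediction flips to "negative". Feeding the content-free probabilities through the same function returns the uniform distribution, which is exactly the calibration condition the abstract describes.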

6 mentions: @arankomatsuzaki @jaguring1 @johnplattml @ak92501 @arxivabs
Date: 2021/02/22 02:21

Referring Tweets

@jaguring1 By carefully crafting the input to GPT-3, a language model that can solve many tasks, GPT-3's performance reportedly improves substantially across a variety of tasks. Calibrate Before Use: Improving Few-Shot Performance of Language Models t.co/wrOg1SKPZt t.co/mqCd9nr7PK
@johnplattml Calibration is useful for "prompted" few-shot learning with modern large language models. t.co/xh2h76FTrN Neat! #DeepLearning #GPT3
@arankomatsuzaki Calibrate Before Use: Improving Few-Shot Performance of Language Models Proposes contextual calibration to improve GPT-3's few-shot performance. t.co/gk7paHFuik t.co/jDfDC9c46l

Related Entries

Read more [2003.10685] Deep Line Art Video Colorization with a Few References
0 users, 3 mentions 2020/03/26 06:51
Read more [2009.11278] X-LXMERT: Paint, Caption and Answer Questions with Multi-Modal Transformers
0 users, 5 mentions 2020/09/24 03:52
Read more [2102.02779] Unifying Vision-and-Language Tasks via Text Generation
0 users, 2 mentions 2021/02/05 03:51
Read more [2102.03334] ViLT: Vision-and-Language Transformer Without Convolution or Region Supervision
0 users, 4 mentions 2021/02/13 17:21
Read more [2102.12122] Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions
0 users, 7 mentions 2021/02/26 15:52