[1909.03622] Transfer Reward Learning for Policy Gradient-Based Text Generation

Task-specific scores are often used both to optimize and to evaluate conditional text generation systems. However, such scores are non-differentiable and cannot be used directly in the standard supervised learning paradigm. Policy gradient methods are therefore used, since they compute the gradient without requiring a differentiable objective. We argue that the n-gram overlap measures currently used as rewards can be improved upon by model-based rewards transferred from tasks that directly compare the similarity of sentence pairs. These reward models output a score of sentence-level syntactic and semantic similarity, either between the entire predicted and target sentences as the expected return, or for intermediate phrases as segmented accumulative rewards. We demonstrate that using a Transferable Reward Learner leads to improved results on semantic evaluation measures in policy-gradient models for image captioning tasks. Our InferSent actor-critic mo
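To make the abstract's central point concrete, here is a minimal sketch of a REINFORCE-style update in which a non-differentiable sentence-similarity score is used as the reward. It is not the paper's implementation. The policy (independent per-position categorical distributions), the function names, and the bag-of-tokens cosine reward are all assumptions chosen for illustration. In particular, `cosine_reward` is only a stand-in for a pretrained similarity model such as InferSent.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cosine_reward(predicted, target, vocab_size):
    """Hypothetical reward: cosine similarity of bag-of-token count vectors.

    A stand-in for a learned sentence-similarity model; it ignores word order.
    """
    p = [0] * vocab_size
    t = [0] * vocab_size
    for tok in predicted:
        p[tok] += 1
    for tok in target:
        t[tok] += 1
    dot = sum(a * b for a, b in zip(p, t))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in t))
    return dot / norm if norm else 0.0


def reinforce_step(logits, target, rng, lr=0.5, baseline=0.0):
    """Sample a sequence, score it, and update the logits in place.

    logits: one list of vocabulary logits per output position (toy policy).
    The reward is never differentiated; it only scales grad log pi(token).
    """
    sampled, probs_per_pos = [], []
    for pos_logits in logits:
        probs = softmax(pos_logits)
        tok = rng.choices(range(len(probs)), weights=probs)[0]
        sampled.append(tok)
        probs_per_pos.append(probs)

    reward = cosine_reward(sampled, target, len(logits[0]))
    advantage = reward - baseline

    for pos, (probs, tok) in enumerate(zip(probs_per_pos, sampled)):
        for k in range(len(probs)):
            # d log softmax(tok) / d logit_k = 1[k == tok] - probs[k]
            grad_logp = (1.0 if k == tok else 0.0) - probs[k]
            logits[pos][k] += lr * advantage * grad_logp
    return sampled, reward


if __name__ == "__main__":
    rng = random.Random(0)
    vocab_size, target = 5, [2, 4, 1]
    logits = [[0.0] * vocab_size for _ in target]
    baseline = 0.0
    for step in range(500):
        _, r = reinforce_step(logits, target, rng, baseline=baseline)
        baseline = 0.9 * baseline + 0.1 * r  # running-mean baseline
    print("final sample:", reinforce_step(logits, target, rng, lr=0.0))
```

The sentence-level score here plays the role of the return for the whole sampled sequence. The segmented variant mentioned in the abstract would instead score intermediate phrases and accumulate those rewards, which this sketch does not attempt.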

1 mention: @JamesONeil21
Date: 2019/09/10 08:18

Referring Tweets

@JamesONeil21 Transfer Reward Learning for Policy Gradient-Based Text Generation (t.co/vTvUpVod7K) - Rewards from pretrained sentence similarity models are used to improve on many existing scores and semantic scores such as Word Mover's Distance and Sliding-Window CosSim @Bollegala t.co/Eet4Q6Y0Nd
