[2006.12862] Automatic Data Augmentation for Generalization in Deep Reinforcement Learning

Deep reinforcement learning (RL) agents often fail to generalize to unseen scenarios, even when they are trained on many instances of semantically similar environments. Data augmentation has recently been shown to improve the sample efficiency and generalization of RL agents. However, different tasks tend to benefit from different kinds of data augmentation. In this paper, we compare three approaches for automatically finding an appropriate augmentation. These are combined with two novel regularization terms for the policy and value function, required to make the use of data augmentation theoretically sound for certain actor-critic algorithms. We evaluate our methods on the Procgen benchmark which consists of 16 procedurally-generated environments and show that it improves test performance by ~40% relative to standard RL algorithms. Our agent outperforms other baselines specifically designed to improve generalization in RL. In addition, we show that our agent learns policies and repres
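
The abstract mentions two regularization terms, one for the policy and one for the value function. One plausible reading is that they penalize the agent whenever its outputs change between an observation and its augmented copy. The following is a minimal sketch under assumptions that are not stated in the abstract: a discrete action space, KL divergence as the policy distance, a squared difference for the value term, and a single weight `alpha`. All function names and parameters are illustrative and are not taken from the authors' code.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def policy_regularizer(logits_orig, logits_aug, eps=1e-8):
    """Mean KL(pi(.|s) || pi(.|aug(s))) over a batch (assumed distance)."""
    p = softmax(logits_orig)
    q = softmax(logits_aug)
    kl = np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1)
    return float(kl.mean())


def value_regularizer(values_orig, values_aug):
    """Mean squared difference between V(s) and V(aug(s)) (assumed form)."""
    return float(np.mean((values_orig - values_aug) ** 2))


def regularized_loss(base_loss, logits_orig, logits_aug,
                     values_orig, values_aug, alpha=0.1):
    """Add both penalties to an existing actor-critic loss.

    `base_loss` stands in for whatever loss the underlying algorithm
    already minimizes; `alpha` is a hypothetical weighting coefficient.
    """
    penalty = (policy_regularizer(logits_orig, logits_aug)
               + value_regularizer(values_orig, values_aug))
    return base_loss + alpha * penalty
```

In this sketch the base loss is computed on the original observations only, and the augmented observations enter solely through the penalty, which is zero when the policy and value outputs are unaffected by the augmentation.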

4 mentions: @icoxfog417 @ak92501 @pm_girl
Date: 2020/06/26 14:21

Referring Tweets

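@icoxfog417 Research on automatically selecting an appropriate data augmentation for deep reinforcement learning. A multi-armed bandit is used for the selection. Because it is theoretically unsound for an augmentation to change the value function or policy substantially (the underlying state is in fact the same), the method regularizes using the distributional distance between the policies and the difference between the value-function outputs before and after augmentation. t.co/YMoW1fjyUv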
@ak92501 Automatic Data Augmentation for Generalization in Deep Reinforcement Learning pdf: t.co/cpmpQCPqX8 abs: t.co/VljGQIUlMe project page: t.co/noQYJQzWUZ github: t.co/nO5w01wOXh t.co/70HxDoz4TY
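
The first tweet above notes that a multi-armed bandit is used to choose the augmentation. As a rough illustration of that idea, here is a sketch of an upper-confidence-bound (UCB1-style) selector. The specific bandit rule, the exploration coefficient, the use of a running mean of returns, and the augmentation names are all assumptions made for this example and are not confirmed by the page.

```python
import math


class UCBAugmentationSelector:
    """UCB1-style bandit whose arms are candidate augmentations."""

    def __init__(self, augmentations, exploration_coef=0.1):
        self.augmentations = list(augmentations)  # hypothetical arm names
        self.exploration_coef = exploration_coef
        self.counts = [0] * len(self.augmentations)
        self.mean_returns = [0.0] * len(self.augmentations)
        self.total_pulls = 0

    def select(self):
        """Return the index of the augmentation to use next."""
        # Try every arm once before applying the UCB rule.
        for index, count in enumerate(self.counts):
            if count == 0:
                return index
        scores = [
            mean + self.exploration_coef
            * math.sqrt(math.log(self.total_pulls) / count)
            for mean, count in zip(self.mean_returns, self.counts)
        ]
        return max(range(len(scores)), key=scores.__getitem__)

    def update(self, index, episode_return):
        """Record the return obtained after training with arm `index`."""
        self.counts[index] += 1
        self.total_pulls += 1
        # Incremental update of the running mean return for this arm.
        step = (episode_return - self.mean_returns[index]) / self.counts[index]
        self.mean_returns[index] += step


# Usage: a toy loop where one arm always yields a higher return.
selector = UCBAugmentationSelector(["crop", "grayscale", "flip"])
toy_returns = {"crop": 1.0, "grayscale": 0.0, "flip": 0.0}
for _ in range(200):
    arm = selector.select()
    selector.update(arm, toy_returns[selector.augmentations[arm]])
```

In a real training loop the return passed to `update` would come from rollouts collected after a policy update that used the selected augmentation; the toy dictionary here only exists to make the example self-contained.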

Related Entries

Read more [1902.09243] Pretraining-Based Natural Language Generation for Text Summarization
0 users, 3 mentions 2019/04/19 03:48
Read more [2001.07966] ImageBERT: Cross-modal Pre-training with Large-scale Weak-supervised Image-Text Data
0 users, 8 mentions 2020/01/30 00:52
Read more [2004.00448] Rethinking Data Augmentation for Image Super-resolution: A Comprehensive Analysis and a...
0 users, 5 mentions 2020/04/04 11:22
Read more [2005.11295] From ImageNet to Image Classification: Contextualizing Progress on Benchmarks
0 users, 3 mentions 2020/05/27 14:21
Read more [2006.03463] Sponge Examples: Energy-Latency Attacks on Neural Networks
0 users, 7 mentions 2020/06/08 05:21