[2010.15920] Recovery RL: Safe Reinforcement Learning with Learned Recovery Zones

Safety remains a central obstacle preventing widespread use of RL in the real world: learning new tasks in uncertain environments requires extensive exploration, but safety requires limiting exploration. We propose Recovery RL, an algorithm which navigates this tradeoff by (1) leveraging offline data to learn about constraint violating zones before policy learning and (2) separating the goals of improving task performance and constraint satisfaction across two policies: a task policy that only optimizes the task reward and a recovery policy that guides the agent to safety when constraint violation is likely. We evaluate Recovery RL on 6 simulation domains, including two contact-rich manipulation tasks and an image-based navigation task, and an image-based obstacle avoidance task on a physical robot. We compare Recovery RL to 5 prior safe RL methods which jointly optimize for task performance and safety via constrained optimization or reward shaping and find that Recovery RL outperforms
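
The two-policy split described in the abstract can be illustrated with a small sketch. This is only an illustration under stated assumptions, not the authors' implementation: "constraint violation is likely" is modeled here as a risk estimate exceeding a threshold, and the names `select_action`, `risk_estimate`, and `risk_threshold`, along with the toy 1-D policies, are hypothetical stand-ins for learned models.

```python
from typing import Callable, Tuple


def select_action(
    state: float,
    task_policy: Callable[[float], float],
    recovery_policy: Callable[[float], float],
    risk_estimate: Callable[[float, float], float],
    risk_threshold: float,
) -> Tuple[float, bool]:
    """Return (action, used_recovery) for one decision step."""
    # The task policy proposes an action based on task reward alone.
    proposed = task_policy(state)
    # Assumption: "violation is likely" means estimated risk exceeds a threshold.
    if risk_estimate(state, proposed) > risk_threshold:
        # Hand control to the recovery policy, which steers back toward safety.
        return recovery_policy(state), True
    return proposed, False


# Toy 1-D world (assumption): positions above 5.0 violate the constraint.
def toy_task_policy(state: float) -> float:
    return 1.0  # always step right, toward the task goal


def toy_recovery_policy(state: float) -> float:
    return -1.0  # step left, away from the unsafe zone


def toy_risk_estimate(state: float, action: float) -> float:
    # Risk is 1.0 if the proposed step would land in the unsafe zone, else 0.0.
    return 1.0 if state + action > 5.0 else 0.0
```

In this sketch the task policy never sees the constraint; only the risk check decides which policy acts, which mirrors the separation of task performance from constraint satisfaction that the abstract describes.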

1 mention: @Ken_Goldberg
Keywords: reinforcement learning

Referring Tweets

@Ken_Goldberg A hearty congratulations to freshly minted @UCBerkeley PhDs: Dr Ashwin Balakrishna @ashwinb96 and Dr Brijen Thananjeyan @brthananjeyan @AUTOLab_Cal @Berkeley_EECS who both changed the way we think about safe robot learning: t.co/aaksiviv9z

Related Entries

Summarizing the pain points of annotation in natural language – chakki – Medium
0 users, 0 mentions 2018/09/03 03:23
[Article updated] My Bookmarks: "Grammatical Error Correction based on NLP" – 人工知能学会 (The Japanese S...
0 users, 0 mentions 2018/11/01 05:45
Trying out MLFlow, part 1 - tracking - iMind Developers Blog
2 users, 0 mentions 2019/06/14 13:32
On measuring credit VaR for a credit portfolio - crtaker’s blog
0 users, 1 mention 2020/08/22 16:53
LAFI 2022 - The Seventh International Workshop on Languages for Inference - POPL 2022
0 users, 1 mention 2022/01/10 12:08



