[2006.10701] Deep Reinforcement Learning amidst Lifelong Non-Stationarity

As humans, our goals and our environment are persistently changing throughout our lifetime based on our experiences, actions, and internal and external drives. In contrast, typical reinforcement learning problem set-ups consider decision processes that are stationary across episodes. Can we develop reinforcement learning algorithms that can cope with the persistent change in the former, more realistic problem settings? While on-policy algorithms such as policy gradients in principle can be extended to non-stationary settings, the same cannot be said for more efficient off-policy algorithms that replay past experiences when learning. In this work, we formalize this problem setting, and draw upon ideas from the online learning and probabilistic inference literature to derive an off-policy RL algorithm that can reason about and tackle such lifelong non-stationarity. Our method leverages latent variable models to learn a representation of the environment from current and past experiences, …
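To make the latent-variable idea concrete, here is a minimal toy sketch (our own illustration, not the paper's actual algorithm or architecture): the environment's hidden parameter drifts across episodes as a random walk, and a simple recursive Gaussian (Kalman-style) update infers a belief over that latent from each episode's observations. All names (`q`, `r_var`, `kalman_episode_update`) and the scalar-Gaussian setup are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy lifelong non-stationarity: the hidden environment parameter z_i
# drifts between episodes as a random walk,
#   z_{i+1} = z_i + N(0, q),
# and within an episode we see noisy observations y = z_i + N(0, r_var).
q, r_var = 0.01, 0.25

def kalman_episode_update(mu, var, observations):
    """Posterior update for the scalar latent z_i after one episode."""
    var = var + q  # predict: latent drifts between episodes
    for y in observations:
        k = var / (var + r_var)      # Kalman gain
        mu = mu + k * (y - mu)       # incorporate one observation
        var = (1.0 - k) * var        # posterior variance shrinks
    return mu, var

# Simulate the drifting latent and track it across episodes.
z, mu, var = 0.0, 0.0, 1.0
errors = []
for episode in range(200):
    z += rng.normal(0.0, np.sqrt(q))                     # environment shifts
    obs = z + rng.normal(0.0, np.sqrt(r_var), size=10)   # one episode of data
    mu, var = kalman_episode_update(mu, var, obs)
    errors.append(abs(mu - z))

print(round(float(np.mean(errors[-50:])), 3))  # tracking error stays small
```

In the paper's setting the latent would instead be inferred by a learned latent variable model, and the resulting representation would condition an off-policy learner's policy and value functions; the sketch only shows why maintaining a per-episode belief lets a learner cope with drift that a stationary model would treat as noise.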

1 mention: @chelseabfinn
Date: 2020/06/27 03:51

Related Entries

Read more [2001.06782] Gradient Surgery for Multi-Task Learning
0 users, 1 mention 2020/01/22 03:50
Read more [1910.09772] Weakly Supervised Disentanglement with Guarantees
0 users, 1 mention 2020/04/07 20:21
Read more [2004.02860] Weakly-Supervised Reinforcement Learning for Controllable Behavior
0 users, 1 mention 2020/04/07 20:21
Read more [2007.02931] Adaptive Risk Minimization: A Meta-Learning Approach for Tackling Group Shift
0 users, 1 mention 2020/07/07 03:51
Read more [1911.08731] Distributionally Robust Neural Networks for Group Shifts: On the Importance of Regulari…
0 users, 1 mention 2020/07/07 03:51