Contrastive Behavioral Similarity Embeddings for Generalization in Reinforcement Learning

Reinforcement learning methods trained on few environments rarely learn policies that generalize to unseen environments. To improve generalization, we incorporate the inherent sequential structure in reinforcement learning into the representation learning process. This approach is orthogonal to recent approaches, which rarely exploit this structure explicitly. Specifically, we introduce a theoretically motivated policy similarity metric (PSM) for measuring behavioral similarity between states. P

2 mentions: @agarwl_, @agarwl_
Date: 2021/01/13 00:52

Referring Tweets

@agarwl_ Yayy, my first spotlight! ICLR'21 AC: "The reviewers unanimously praised the work in terms of theory, algorithm and empirical evaluation. This is a novel and technically deep contribution that advances the SOTA for RL generalization." #tweeprint soon.
@agarwl_ @bucketofkets @rico_jski [Self-Plug] We do have a contrastive method (although with a bit of sequential aspect of RL baked in) perform quite well on this benchmark.
