[2010.07893] An Alternative to Backpropagation in Deep Reinforcement Learning

State-of-the-art deep learning algorithms mostly rely on gradient backpropagation to train deep artificial neural networks, a procedure generally regarded as biologically implausible. For a network of stochastic units trained on a reinforcement learning task or a supervised learning task, one biologically plausible way of learning is to train each unit by REINFORCE. In this case, only a global reward signal has to be broadcast to all units, and the resulting learning rule is local; it can be interpreted as reward-modulated spike-timing-dependent plasticity (R-STDP), which is observed biologically. Although this learning rule follows the gradient of return in expectation, it suffers from high variance and cannot be used to train a deep network in practice. In this paper, we propose an algorithm called MAP propagation that can reduce this variance significantly while retaining the local property of the learning rule. Unlike prior work on local learning rules (e.g. Contrastive Divergen
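To make the per-unit REINFORCE rule mentioned in the abstract concrete, here is a minimal sketch for a single Bernoulli-logistic unit. It is not the paper's MAP propagation algorithm; the function names, the learning rate, and the toy reward (+1 if the unit fires, -1 otherwise) are assumptions chosen purely for illustration.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def reinforce_unit_update(weights, inputs, fired, reward, lr=0.1):
    """One local REINFORCE step for a single Bernoulli-logistic unit.

    The update uses only the unit's own inputs, its own sampled output,
    and the globally broadcast scalar reward.
    """
    p = sigmoid(sum(w * x for w, x in zip(weights, inputs)))
    # d/dw_j log Bernoulli(fired; p) = (fired - p) * x_j for a logistic unit.
    return [w + lr * reward * (fired - p) * x for w, x in zip(weights, inputs)]


def train_single_unit(steps=2000, seed=0):
    """Hypothetical toy task: reward is +1 if the unit fires, else -1."""
    rng = random.Random(seed)
    weights = [0.0]
    inputs = [1.0]  # constant bias input
    for _ in range(steps):
        p = sigmoid(weights[0] * inputs[0])
        fired = 1.0 if rng.random() < p else 0.0
        reward = 1.0 if fired else -1.0
        weights = reinforce_unit_update(weights, inputs, fired, reward)
    return sigmoid(weights[0])  # final firing probability
```

In this toy setting the firing probability drifts toward 1 because firing is always rewarded. In a deep network, every unit would receive the same scalar reward regardless of its own contribution, which is the source of the high variance the abstract describes.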

1 mention: @evolvingstuff
Date: 2020/10/17 02:22

Referring Tweets

@evolvingstuff An Alternative to Backpropagation in Deep Reinforcement Learning "We show that the newly proposed algorithm can solve common reinforcement learning tasks at a speed similar to that of backpropagation when applied to an actor-critic network." t.co/MS0js4ae5w t.co/3ZKX9Uh9eC

Related Entries

GitHub - williamFalcon/pytorch-lightning: Rapid research framework for PyTorch. The researcher's ver...
0 users, 2 mentions 2019/09/07 05:17
[2003.13616] Difference Attention Based Error Correction LSTM Model for Time Series Prediction
0 users, 2 mentions 2020/03/31 15:51
[2010.04029] RNNLogic: Learning Logic Rules for Reasoning on Knowledge Graphs
0 users, 2 mentions 2020/10/10 17:21
[2010.12527] Retrieve, Rerank, Read, then Iterate: Answering Open-Domain Questions of Arbitrary Comp...
0 users, 1 mention 2020/10/26 02:21
[2010.14439] Differentiable Open-Ended Commonsense Reasoning
0 users, 1 mention 2020/10/28 03:51