[1909.04630] Meta-Learning with Implicit Gradients

A core capability of intelligent systems is the ability to quickly learn new tasks by drawing on prior experience. Gradient (or optimization) based meta-learning has recently emerged as an effective approach for few-shot learning. In this formulation, meta-parameters are learned in the outer loop, while task-specific models are learned in the inner loop, using only a small amount of data from the current task. A key challenge in scaling these approaches is the need to differentiate through the inner-loop learning process, which can impose considerable computational and memory burdens. By drawing upon implicit differentiation, we develop the implicit MAML algorithm, which depends only on the solution to the inner-level optimization and not the path taken by the inner-loop optimizer. This effectively decouples the meta-gradient computation from the choice of inner-loop optimizer. As a result, our approach is agnostic to the choice of inner-loop optimizer and can gracefully handle many gradient steps without vanishing gradients or memory constraints.
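The key identity behind the approach can be illustrated on a toy problem. With a proximal inner objective (the task loss plus a regularizer (λ/2)‖φ − θ‖² pulling the task parameters φ toward the meta-parameters θ), the implicit meta-gradient is (I + H/λ)⁻¹g, where H is the inner-loss Hessian and g the outer-loss gradient, both evaluated only at the inner solution φ*. The sketch below uses a quadratic inner loss so φ* and H are available in closed form; the names (`lam`, `A`, `b`, `inner_solution`) and the direct matrix solve are illustrative assumptions, not the paper's actual implementation, which uses conjugate gradient to avoid forming the inverse.

```python
import numpy as np

# Minimal sketch of the iMAML implicit meta-gradient on a toy quadratic task.
# All names and the closed-form inner solve are illustrative assumptions.

lam = 1.0  # strength of the proximal regularizer (lam/2)||phi - theta||^2

# Toy quadratic inner task loss: L_task(phi) = 0.5 phi^T A phi - b^T phi
A = np.array([[3.0, 0.5],
              [0.5, 2.0]])
b = np.array([1.0, -1.0])

def inner_solution(theta):
    # phi* minimizes L_task(phi) + (lam/2)||phi - theta||^2, which for the
    # quadratic toy gives the linear system (A + lam I) phi* = b + lam theta.
    return np.linalg.solve(A + lam * np.eye(2), b + lam * theta)

def implicit_meta_gradient(theta, outer_grad_fn):
    phi_star = inner_solution(theta)
    g = outer_grad_fn(phi_star)  # gradient of the outer (test) loss at phi*
    H = A                        # Hessian of the inner loss at phi* (constant here)
    # Implicit meta-gradient: (I + H/lam)^{-1} g.
    # Note it depends only on phi*, not on how phi* was reached.
    return np.linalg.solve(np.eye(2) + H / lam, g)

# Example: outer loss 0.5||phi||^2, so its gradient is phi itself.
theta = np.zeros(2)
print(implicit_meta_gradient(theta, lambda phi: phi))
```

Because the formula only touches the converged solution, memory stays constant in the number of inner steps; in practice the linear solve is done matrix-free with conjugate gradient using Hessian-vector products.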

7 mentions: @chelseabfinn, @hillbig, @hillbig, @eliza_luth, @GeeksDataa, @KaisensData, @4indata
Date: 2019/09/11 02:17

Referring Tweets

@hillbig Using implicit differentiation, the gradient with respect to the meta-parameters can be computed without explicitly unrolling each task's learning process, keeping memory usage constant (iMAML) t.co/ZnFvrddyGY The same idea can also be used in hierarchical Bayesian meta-learning, when maximizing the predictive likelihood for each task t.co/sp3bcYYmxj
@hillbig Meta-learning requires inner-loop optimization for each task, and implicit differentiation can compute the gradient directly from the solution. (iMAML) t.co/ZnFvrddyGY A similar idea has also been proposed in the hierarchical Bayesian meta-learning setting. t.co/sp3bcYYmxj
@chelseabfinn It's hard to scale meta-learning to long inner optimizations. We introduce iMAML, which meta-learns *without* differentiating through the inner optimization path using implicit differentiation. t.co/hOSi3Irymh to appear @NeurIPSConf w/ @aravindr93 @ShamKakade6 @svlevine t.co/fBznTaubgr
@eliza_luth In most meta-learning methods, tasks are implicitly related via the shared model/optimizer. We show that a meta-learner that explicitly relates tasks on a graph can significantly improve the performance of few-shot learning. t.co/m2fBHgBB35 to appear @NeurIPSConf t.co/xwzP4B0UtF