Meta-Learning Representations for Continual Learning

NeurIPS 2019 8-24-2020

Motivation

Neural networks trained online on a correlated stream of data suffer from catastrophic forgetting. We propose learning a representation that is robust to forgetting. To learn it, we propose OML, a second-order meta-learning objective that directly minimizes interference. Highly sparse representations naturally emerge from minimizing the proposed objective.
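
A rough schematic of that objective (a sketch in notation introduced here for illustration: $U_k$, $S_j^{\mathrm{traj}}$, and $\mathcal{L}_{\mathcal{T}_j}$ are not taken verbatim from the paper):

$$
\min_{\theta, w} \; \mathbb{E}_{\mathcal{T}_j \sim p(\mathcal{T})} \Big[ \, \mathcal{L}_{\mathcal{T}_j}\big(\theta, \, U_k(w; \theta, S_j^{\mathrm{traj}})\big) \, \Big]
$$

where $U_k(w; \theta, S_j^{\mathrm{traj}})$ is the result of $k$ online SGD steps on $w$ alone over a correlated trajectory $S_j^{\mathrm{traj}}$, and the outer loss is evaluated on held-out task data, so it is small only when the online updates both learn the new task and avoid interfering with earlier predictions. Differentiating through the inner updates is what makes the objective second-order.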

Two parts in the model, as shown in the figure: a representation learning network (RLN) $\phi_\theta$ and a prediction learning network (PLN) $g_w$, composed as $g_w(\phi_\theta(x))$.

  • Meta-parameters $\theta$ (RLN): a deep neural network that transforms high-dimensional inputs into a representation in $\mathbb{R}^d$ that is more conducive to continual learning.

  • Adaptation parameters $w$ (PLN): a simple neural network that learns continually on top of the representation in $\mathbb{R}^d$.
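
To make the two parts concrete, here is a minimal PyTorch sketch (layer sizes and the MLP architecture are illustrative assumptions, not the paper's exact networks):

```python
import torch
import torch.nn as nn

class RLN(nn.Module):
    """Representation learning network phi_theta: input -> R^d (meta-parameters)."""
    def __init__(self, in_dim=784, d=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 512), nn.ReLU(),
            nn.Linear(512, d), nn.ReLU(),
        )

    def forward(self, x):
        return self.net(x)

class PLN(nn.Module):
    """Prediction learning network g_w: R^d -> class logits (adaptation parameters)."""
    def __init__(self, d=256, n_classes=10):
        super().__init__()
        self.head = nn.Linear(d, n_classes)

    def forward(self, h):
        return self.head(h)

# Full model: g_w(phi_theta(x)). Only the PLN is updated in the inner loop.
rln, pln = RLN(), PLN()
logits = pln(rln(torch.randn(4, 784)))  # dummy batch
```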

Meta-Training:

Step 1: Adaptation (inner-loop updates for $g_w$ only)

Step 2: Compute the meta-loss on the complete task dataset (similar to MAML)

Step 3: Meta-update: differentiate the meta-loss through the adaptation phase to update both $\phi_\theta$ and $g_w$ (similar to MAML)
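
A minimal PyTorch 2.x sketch of these three steps, reusing the `rln`/`pln` modules from the sketch above (the task sampler, learning rates, and batch sizes are placeholder assumptions; the random query batch stands in for the meta-loss data):

```python
import torch
import torch.nn.functional as F
from torch.func import functional_call  # requires PyTorch >= 2.0

inner_lr, outer_lr, inner_steps = 0.01, 1e-3, 5
meta_opt = torch.optim.Adam(list(rln.parameters()) + list(pln.parameters()), lr=outer_lr)

def sample_task():
    """Placeholder task sampler: a correlated trajectory for the inner loop
    and a query batch on which the meta-loss is computed."""
    x_traj = torch.randn(inner_steps, 1, 784)
    y_traj = torch.randint(0, 10, (inner_steps, 1))
    x_query, y_query = torch.randn(16, 784), torch.randint(0, 10, (16,))
    return x_traj, y_traj, x_query, y_query

for it in range(1000):
    x_traj, y_traj, x_query, y_query = sample_task()

    # Step 1: inner-loop adaptation -- online SGD on the PLN weights only,
    # keeping the graph so second-order gradients can flow later.
    fast_w = dict(pln.named_parameters())
    for t in range(inner_steps):
        h = rln(x_traj[t])  # phi_theta is not updated in the inner loop
        loss = F.cross_entropy(functional_call(pln, fast_w, (h,)), y_traj[t])
        grads = torch.autograd.grad(loss, list(fast_w.values()), create_graph=True)
        fast_w = {k: w - inner_lr * g for (k, w), g in zip(fast_w.items(), grads)}

    # Step 2: meta-loss with the adapted PLN weights.
    meta_loss = F.cross_entropy(functional_call(pln, fast_w, (rln(x_query),)), y_query)

    # Step 3: meta-update -- differentiate through the adaptation phase and
    # update both theta (RLN) and w (PLN).
    meta_opt.zero_grad()
    meta_loss.backward()
    meta_opt.step()
```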

Meta-Testing:

Step 1: Adaptation (inner-loop updates for $g_w$ only, starting from the $\phi_\theta$ and $g_w$ learned during meta-training)

Step 2: Evaluation (fix both $\phi_\theta$ and $g_w$ and compute the accuracy)
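
A matching sketch of meta-testing, again reusing `rln`/`pln` from above; `test_stream` and `eval_loader` are hypothetical placeholders standing in for the meta-test task data:

```python
import torch
import torch.nn.functional as F

# Hypothetical placeholder data (random tensors); in practice these come from
# the held-out meta-test classes.
test_stream = [(torch.randn(1, 784), torch.randint(0, 10, (1,))) for _ in range(20)]
eval_loader = [(torch.randn(16, 784), torch.randint(0, 10, (16,))) for _ in range(5)]

# Step 1: adaptation -- phi_theta stays frozen; only g_w is updated online.
for p in rln.parameters():
    p.requires_grad_(False)
opt = torch.optim.SGD(pln.parameters(), lr=0.01)
for x, y in test_stream:
    loss = F.cross_entropy(pln(rln(x)), y)
    opt.zero_grad()
    loss.backward()
    opt.step()

# Step 2: evaluation -- both phi_theta and g_w are now fixed; measure accuracy.
correct = total = 0
with torch.no_grad():
    for x, y in eval_loader:
        pred = pln(rln(x)).argmax(dim=1)
        correct += (pred == y).sum().item()
        total += y.numel()
print(f"accuracy = {correct / total:.3f}")
```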

Note:

  • Check the Online Meta-Learning paper again and compare the differences with OML.

  • Pay attention to the domain-shift problem.

  • The two-part architecture (a representation network plus a prediction network) appears in many papers.

Reference:

Javed, K., & White, M. (2019). Meta-Learning Representations for Continual Learning. NeurIPS 2019.
