Incremental Few Shot Learning with Attention Attractor Network

NeurIPS 2019, 9/7/2020

Motivation

Machine learning classifiers are often trained to recognize a set of pre-defined classes.

However, in many applications, it is often desirable to have the flexibility of learning additional concepts, with limited data and without re-training on the full training set.

This paper addresses this problem, incremental few-shot learning: a regular classification network has already been trained to recognize a set of base classes, and several novel classes must then be learned, each with only a few labeled examples.

The model in the paper recognizes the novel classes while still remembering the base classes, without needing to revisit the original training set, and outperforms various baselines.

Some definitions

Large datasets need detailed annotation, requiring intensive human labor.

Human learning: new concepts can be learned from very few examples.

Few-shot learning (FSL): bridges this gap. A model learns to output a classifier given only a few labeled examples of unseen classes.

  • limitation: Few-shot models focus only on learning the novel classes, ignoring the fact that many common classes already have abundant labeled data in large datasets.

Incremental few-shot learning (low-shot learning): aims to enjoy the best of both worlds, the ability to learn from large datasets for common classes with the flexibility of few-shot learning for others.
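To make the episodic setting concrete, here is a minimal sketch of how a standard few-shot episode could be extended to the incremental setting, where the query set also contains base-class examples. The function and parameter names (sample_episode, n_way, k_shot, n_query) are illustrative, not taken from the paper.

```python
import random

def sample_episode(base_data, novel_data, n_way=5, k_shot=1, n_query=5):
    """base_data / novel_data: dicts mapping a class id to a list of examples."""
    support, query = [], []
    # For each sampled novel class, draw disjoint support and query examples.
    for c in random.sample(list(novel_data), n_way):
        examples = random.sample(novel_data[c], k_shot + n_query)
        support += [(x, c) for x in examples[:k_shot]]
        query += [(x, c) for x in examples[k_shot:]]
    # Unlike standard FSL, the query set also contains base-class examples,
    # so the learner is evaluated jointly on old and new classes.
    for c in random.sample(list(base_data), n_way):
        query += [(x, c) for x in random.sample(base_data[c], n_query)]
    return support, query
```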

Some prior work starts off with a network pre-trained on a set of base classes to learn a good representation, and tries to augment the classifier with a batch of new classes that have not been seen during training.

It combines:

  • Incremental learning: we want to add new classes without catastrophic forgetting.

    • a setting where information is arriving continuously while prior knowledge needs to be transferred.

    • A key challenge is catastrophic forgetting, i.e., the model forgets previously learned knowledge.

  • FSL: the new classes, unlike the base classes, have only a small number of examples.

Meta-learning: a machine learning paradigm where the meta-learner tries to improve the base learner using the learning experience gathered across multiple tasks (a minimal episodic training loop is sketched after the bullets below).

  • limitation: Meta-learning methods typically learn an update policy but lack an explicit overall learning objective within each few-shot episode.

  • Furthermore, they could suffer from short-horizon bias if, at test time, the model is trained for more steps than it was during meta-training.
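To illustrate the episodic meta-learning paradigm above, here is a toy, self-contained sketch (not the paper's algorithm): each task is a 1-D linear regression, the base learner fits its weight in closed form with an L2 pull toward a meta-learned prior, and the meta-learner updates the prior mean and regularization strength from the query-set loss.

```python
import torch

m = torch.zeros(1, requires_grad=True)        # meta-learned prior mean for the task weight
log_lam = torch.zeros(1, requires_grad=True)  # meta-learned (log) regularization strength
meta_opt = torch.optim.Adam([m, log_lam], lr=1e-2)

for episode in range(2000):
    # Sample a task: 1-D linear regression with a random slope near 2.0.
    true_w = 2.0 + 0.3 * torch.randn(1)
    xs, xq = torch.randn(5, 1), torch.randn(20, 1)   # support / query inputs
    ys, yq = xs * true_w, xq * true_w

    # Inner loop (base learner), in closed form: ridge regression whose
    # solution is pulled toward the meta-learned prior mean m.
    lam = log_lam.exp()
    w = (xs.t() @ ys + lam * m) / (xs.t() @ xs + lam)

    # Outer loop (meta-learner): improve m and lam using the query-set loss.
    query_loss = ((xq @ w - yq) ** 2).mean()
    meta_opt.zero_grad()
    query_loss.backward()
    meta_opt.step()
```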

Model in the paper

  • Incremental Few-Shot Learning

  • Attention Attractor Networks

Incremental FSL:

Pre-training stage: train a feature extractor (backbone) and a classifier for the base classes on the full base-class training set.

Incremental FS Episodes: sample a few novel classes, each with a small support set; the query set contains examples of both base and novel classes, so the model is evaluated jointly.

Meta-Learning Stage: across many such episodes, learn the attention attractor network, a regularizer applied when training the novel-class weights; the query-set loss is back-propagated through the episodic optimization (the paper uses recurrent back-propagation).

Joint Prediction on base and novel classes: the pretrained base-class weights are concatenated with the episodically learned novel-class weights, and a softmax is taken over the union of classes (see the sketch below).
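A minimal sketch of the joint prediction step, assuming a frozen backbone that produces features, frozen base-class weights, and novel-class weights trained on the support set; the names (joint_logits, W_base, W_novel) are illustrative.

```python
import torch
import torch.nn.functional as F

def joint_logits(features, W_base, b_base, W_novel, b_novel):
    """features: [batch, d]; W_base: [n_base, d]; W_novel: [n_novel, d]."""
    W = torch.cat([W_base, W_novel], dim=0)   # one linear head over all classes
    b = torch.cat([b_base, b_novel], dim=0)
    return features @ W.t() + b

# Usage: predict over the union of (say) 100 base classes and 5 novel classes.
feats = torch.randn(8, 64)
W_base, b_base = torch.randn(100, 64), torch.zeros(100)
W_novel, b_novel = torch.randn(5, 64), torch.zeros(5)
probs = F.softmax(joint_logits(feats, W_base, b_base, W_novel, b_novel), dim=-1)
```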

This is not a purely gradient-based approach; it seems closer to metric- or model-based meta-learning (a rough sketch of the attractor regularizer follows below).
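As a rough illustration of the attention attractor idea, the sketch below pulls each novel-class weight toward an attended mixture of per-base-class attractor vectors via a quadratic penalty. The exact parameterization (keys, attractor vectors, diagonal strength gamma, and the use of mean support features for attention) is a simplified assumption, not the paper's precise formulation.

```python
import torch
import torch.nn.functional as F

def attractor_regularizer(W_novel, h_novel, keys, attractors, gamma):
    """W_novel: [n_novel, d] novel-class weights; h_novel: [n_novel, d] mean
    support features per novel class; keys, attractors: [n_base, d]; gamma: [d]."""
    attn = F.softmax(h_novel @ keys.t(), dim=-1)  # attention over base classes
    u = attn @ attractors                         # per-novel-class attractor, [n_novel, d]
    diff = W_novel - u
    return (gamma.exp() * diff * diff).sum()      # quadratic pull toward the attractor

# During an episode, the novel-class weights would be trained on the support-set
# loss plus this regularizer; keys, attractors, and gamma are meta-learned.
```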

Note:

How did they set up their experiments? How does their setup differ from other methods?
