Meta-Dataset: A Dataset of Datasets for Learning to Learn from Few Examples

ICLR 2020, 10-23-2020

This is primarily an empirical paper. It proposes a large-scale, diverse benchmark for measuring the competence of image classification models in a realistic and challenging few-shot setting, and offers a framework for investigating several important aspects of few-shot classification. The benchmark is composed of 10 publicly available datasets of natural images, handwritten characters, and doodles.

In Meta-Dataset, in addition to the tough generalization challenge to new classes inherent in the few-shot learning setup, the authors also study generalization to entirely new datasets, from which no images of any class were seen during training.

It introduces a sampling algorithm that generates tasks of varying characteristics and difficulty by varying the number of classes in each task and the number of available examples per class, introducing class imbalance, and, for some datasets, varying the degree of similarity between the classes of each task. Some example test tasks from Meta-Dataset are shown below.
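The flavor of such a sampler can be sketched as follows. This is a hypothetical simplification, not the paper's actual implementation (the function and parameter names are illustrative): the number of ways and the per-class shots are drawn independently for each task, which also produces class imbalance.

```python
import random

def sample_episode(class_to_images, max_ways=50, max_support=100, max_query=10):
    """Illustrative Meta-Dataset-style episode sampler: ways, shots,
    and class balance all vary from task to task."""
    classes = list(class_to_images)
    n_ways = random.randint(2, min(max_ways, len(classes)))
    chosen = random.sample(classes, n_ways)
    support, query = [], []
    for label, cls in enumerate(chosen):
        images = class_to_images[cls][:]
        random.shuffle(images)
        n_query = min(max_query, len(images) // 2)
        # Each class independently draws its own support size,
        # so shots vary and classes end up imbalanced.
        n_support = random.randint(1, max(1, min(max_support, len(images) - n_query)))
        support += [(img, label) for img in images[:n_support]]
        query += [(img, label) for img in images[n_support:n_support + n_query]]
    return support, query
```

The support/query split is disjoint per class, mirroring the few-shot protocol where the support set is used for adaptation and the query set for evaluation.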

We benchmark two main families of few-shot learning models on Meta-Dataset: pre-training and meta-learning.

Pre-training simply trains a classifier (a neural network feature extractor followed by a linear classifier) on the training classes using supervised learning. The examples of a test task can then be classified either by fine-tuning the pre-trained feature extractor and training a new task-specific linear classifier, or by a k-nearest-neighbor (kNN) classifier, where the prediction for each query example is the label of its nearest support example. The authors evaluate several representative meta-learning models (e.g., Matching Networks, Prototypical Networks, MAML) on this benchmark and analyze the results. Furthermore, they combine MAML and Prototypical Networks into Proto-MAML, which achieves the best performance on this new dataset.
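The kNN variant of the pre-training baseline is easy to sketch, assuming features have already been extracted by the pre-trained network (`knn_predict` is an illustrative name, not from the paper):

```python
import numpy as np

def knn_predict(support_feats, support_labels, query_feats):
    """1-NN classification in a pre-trained feature space: each query
    example receives the label of its nearest support example."""
    # Pairwise squared Euclidean distances, shape (n_query, n_support).
    d = ((query_feats[:, None, :] - support_feats[None, :, :]) ** 2).sum(-1)
    return support_labels[d.argmin(axis=1)]
```

No task-specific training happens here at all; only the fine-tuning variant adapts any weights at test time.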

Proto-MAML makes a simple modification to the original MAML algorithm: the linear classification layer for each task is initialized from the prototypical classifier for that task. This is the only difference from vanilla MAML. Intuitively, the aim is to meta-learn the embedding weights such that, given a new task, initializing the output layer from that task's prototypes and performing a few adaptation steps on the embedding and output layers suffice for performing well on the task's query set.
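The prototype-based initialization can be made concrete. With squared Euclidean distance, a Prototypical Networks classifier is equivalent to a linear layer with weights 2c_k and biases -||c_k||^2 (up to a constant shared across classes), which is the form Proto-MAML uses to initialize the task-specific output layer. A NumPy sketch (function name illustrative):

```python
import numpy as np

def proto_init(support_feats, support_labels, n_classes):
    """Initialize a task-specific linear layer from class prototypes.
    Logits x @ W.T + b then rank classes identically to negative
    squared Euclidean distance to the prototypes."""
    prototypes = np.stack([
        support_feats[support_labels == k].mean(axis=0)
        for k in range(n_classes)
    ])
    W = 2.0 * prototypes
    b = -(prototypes ** 2).sum(axis=1)
    return W, b
```

The equivalence follows from expanding -||x - c_k||^2 = -||x||^2 + 2x.c_k - ||c_k||^2: the -||x||^2 term is the same for every class, so it does not affect the softmax.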

Summary of findings for pre-training and meta-learning models

  • Existing approaches have trouble leveraging heterogeneous training data sources.

  • Some models are more capable than others of exploiting additional data at test time.

  • The adaptation algorithm of a meta-learner is more heavily responsible for its performance than the fact that it is trained end-to-end (i.e. meta-trained).

This paper provides a very good resource for research on several different perspectives of few-shot learning: a more realistic variant of few-shot classification, including underemphasized aspects such as variable shots and ways, class imbalance, class structure, and heterogeneous data.
