Large-Scale Long-Tailed Recognition in an Open World
9-25-2020
Last updated
Real-world data often have a long-tailed and open-ended distribution.
Three common problems and their limitations:
Imbalanced classification: not sensitive to novelty
Few-shot learning: cannot avoid forgetting
Open set recognition (OOD detection): cannot transfer knowledge
A practical recognition system must classify among majority and minority classes, generalize from a few known instances, and acknowledge novelty when it encounters a never-seen instance.
This paper defines Open Long-Tailed Recognition (OLTR) as learning from such naturally distributed data and optimizing the classification accuracy over a balanced test set that includes head, tail, and open classes.
OLTR must handle imbalanced classification, few-shot learning, and open-set recognition in one integrated algorithm, whereas existing classification approaches focus on only one aspect and perform poorly over the entire class spectrum.
The key challenges are how to share visual knowledge between head and tail classes and how to reduce confusion between tail and open classes.
OLTR aims to achieve knowledge transfer, sensitivity to novelty, and resistance to forgetting in a unified form.
We develop an integrated OLTR algorithm that maps an image to a feature space such that visual concepts can easily relate to each other based on a learned metric that respects the closed-world classification while acknowledging the novelty of the open world. Our so-called dynamic meta-embedding combines a direct image feature and an associated memory feature, with the feature norm indicating the familiarity to known classes.
Firstly, a visual memory is obtained by aggregating the knowledge from both head and tail classes.
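One simple way to picture the visual memory is as one centroid per class, computed from the features of that class. The sketch below assumes exactly that; the function name `build_visual_memory` and the toy features are hypothetical, and the notes above only say that knowledge from head and tail classes is aggregated, not how.

```python
import numpy as np

def build_visual_memory(features, labels, num_classes):
    """Return a (num_classes, dim) array whose row k is the mean feature of class k.

    Assumption: the memory holds one centroid per class (head or tail alike).
    Classes with no samples keep an all-zero row.
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    memory = np.zeros((num_classes, features.shape[1]))
    for k in range(num_classes):
        mask = labels == k
        if mask.any():
            memory[k] = features[mask].mean(axis=0)
    return memory

# Toy data: class 0 is a "head" class with two samples, class 1 a "tail" class with one.
features = np.array([[1.0, 0.0], [3.0, 0.0], [0.0, 2.0]])
labels = np.array([0, 0, 1])
memory = build_visual_memory(features, labels, num_classes=2)
```

Because every class contributes exactly one row, a tail class with a single sample occupies as much of the memory as a head class with thousands.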
Secondly, the visual concepts stored in the memory are infused back as an associated memory feature to enhance the original direct feature. This can be understood as using induced knowledge (i.e., the memory feature) to assist the direct observation (i.e., the direct feature).
We further learn a concept selector (e in the bottom figure) to control the amount and type of memory feature to be infused. Since head classes already have abundant direct observations, only a small amount of memory feature is infused for them. Tail classes, by contrast, suffer from scarce observations, so the associated visual concepts in the memory feature are extremely beneficial to them.
Finally, we calibrate the confidence of open classes by calculating their reachability (No. 3 in the bottom figure) to the obtained visual memory.
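The infusion step can be sketched roughly as follows. This is an illustration under stated assumptions, not the paper's exact architecture: the memory feature is taken to be a softmax-weighted mix of memory rows, the concept selector is taken to be a tanh gate computed from the direct feature, and `w_att` and `w_sel` are hypothetical stand-ins for learned weights.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    z = np.exp(x - x.max())
    return z / z.sum()

def infuse_memory(v_direct, memory, w_att, w_sel):
    """Enhance a direct feature with a gated memory feature.

    v_direct: (D,) direct image feature
    memory:   (K, D) visual memory, one row per stored concept
    w_att:    (K, D) hypothetical weights producing attention over memory rows
    w_sel:    (D, D) hypothetical weights of the concept selector
    """
    coeffs = softmax(w_att @ v_direct)    # (K,) how much of each concept to read
    v_memory = coeffs @ memory            # (D,) associated memory feature
    selector = np.tanh(w_sel @ v_direct)  # (D,) per-dimension gate in [-1, 1]
    return v_direct + selector * v_memory, selector, v_memory

rng = np.random.default_rng(0)
memory = rng.normal(size=(3, 4))
v_direct = rng.normal(size=4)
w_att = rng.normal(size=(3, 4))

# A selector that outputs zeros infuses nothing, as one might expect for a
# well-observed head-class sample.
w_sel = np.zeros((4, 4))
enhanced, selector, v_memory = infuse_memory(v_direct, memory, w_att, w_sel)
```

With the all-zero `w_sel` used here, the gate is closed and `enhanced` equals `v_direct`; a trained selector would open the gate more for tail-class inputs.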
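A minimal sketch of the calibration idea, assuming reachability is the smallest Euclidean distance from a direct feature to any memory centroid and that the feature is scaled by its inverse. Both choices, the helper names, and the toy numbers are illustrative assumptions rather than details stated in these notes.

```python
import numpy as np

def reachability(v_direct, memory):
    """Smallest Euclidean distance from the feature to any memory centroid."""
    return np.linalg.norm(memory - v_direct, axis=1).min()

def calibrate(v, v_direct, memory, eps=1e-8):
    """Scale a feature by inverse reachability (eps guards against division by zero).

    Features close to a known centroid get a large norm (familiar);
    features far from every centroid get a small norm (likely open-class).
    """
    return v / (reachability(v_direct, memory) + eps)

memory = np.array([[3.0, 4.0], [-3.0, 4.0]])  # two hypothetical class centroids
familiar = np.array([3.0, 4.5])               # sits next to the first centroid
novel = np.array([0.0, -5.0])                 # far from both centroids

familiar_norm = np.linalg.norm(calibrate(familiar, familiar, memory))
novel_norm = np.linalg.norm(calibrate(novel, novel, memory))
```

Here `familiar_norm` comes out much larger than `novel_norm`, which matches the idea that the feature norm indicates familiarity with known classes.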