Meta Approaches to Few-shot Image Classification

Date
2021-02-26
Abstract

Since the inception of deep Convolutional Neural Network (CNN) architectures, we have seen tremendous advances in machine image classification. However, these methods require large amounts of data, sometimes on the order of millions of examples, and often fail to generalize when the data set is small. Recently, a new paradigm called 'Few-Shot Learning' has been developed to tackle this problem. Essentially, the goal of few-shot learning is to develop techniques that can rapidly generalize to new tasks containing very few labeled samples, in extreme cases one (called one-shot) or zero (called zero-shot). In this work, I tackle few-shot learning as applied to image classification. The most common approach is known as meta-learning, or 'learning to learn': rather than learning to solve one particular learning problem, the goal is to solve many learning problems in an attempt to learn how to solve problems of a given type. Another way to approach the problem is to re-purpose an existing learner for a new learning problem, known as transfer learning. In my thesis, I propose two novel approaches, based on meta-learning and transfer learning respectively, to tackle few-shot (or one-shot) image classification. The first approach, called meta-meta classification, uses a large set of learning problems to design an ensemble of learners, each of which has high bias and low variance and is skilled at solving a specific type of learning problem. The meta-meta classifier learns how to examine a given learning problem and combine the various learners to solve it. One such problem is the one-vs-all (OvA) classification task, where only one image from the positive class is available for training along with images from a number of negative classes.
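The meta-meta classification idea can be sketched as follows. This is a minimal illustrative stand-in, not the thesis's actual method: the prototype-based learners, the softmax task weighting, and all sizes are my own assumptions in place of the trained networks the abstract describes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical specialized learners: each scores a query by closeness to a
# prototype, standing in for a high-bias, low-variance trained model.
def make_learner(prototype):
    def score(x):
        return -np.linalg.norm(x - prototype)  # higher = more positive-like
    return score

prototypes = rng.normal(size=(3, 8))  # three "types" of learning problem
learners = [make_learner(p) for p in prototypes]

def meta_meta_weights(support_example):
    # The meta-meta classifier examines the task (here, the feature vector of
    # its single positive support image) and weights the learners accordingly.
    sims = np.array([-np.linalg.norm(support_example - p) for p in prototypes])
    e = np.exp(sims - sims.max())
    return e / e.sum()  # softmax over the ensemble

def combined_score(support_example, query):
    # Weighted combination of the specialized learners' scores.
    w = meta_meta_weights(support_example)
    return float(sum(wi * f(query) for wi, f in zip(w, learners)))
```

In this sketch, a query close to the one-shot support example scores higher than a distant one, so thresholding `combined_score` yields a one-vs-all decision.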
I evaluate my approach on a one-shot, one-class-versus-all classification task and show that it outperforms both traditional meta-learning and ensembling approaches. I evaluate my method on the popular 1,000-class ImageNet data set (ILSVRC2012), the 200-class Caltech-UCSD Birds data set, the 102-class FGVC-Aircraft data set, and the 1,200-class Omniglot hand-written character data set. I compare my results with a popular meta-learning algorithm, the model-agnostic meta-learner (MAML), as well as an ensemble of multiple MAML models, and show that my approach outperforms them on all of these problems. The second approach I investigate uses the existing concept of transfer learning, where a simple Multi-Layer Perceptron (MLP) with a single hidden layer is fine-tuned on top of pre-trained CNN backbones. Surprisingly, very few works in the few-shot literature have even examined the use of an MLP for fine-tuning pre-trained models (the assumption may be that a hidden layer would introduce too many parameters for few-shot learning). To avoid overfitting, I simply use an L2 regularizer. I argue that a diverse feature vector, assembled from a library of models pre-trained on a diverse data set (such as ILSVRC2012), can readily be re-purposed for small-data problems. I performed a series of experiments on both classification accuracy and feature behavior on multiple few-shot problems. I carefully picked the hyperparameters after validating on the Caltech-UCSD Birds data set and did the final evaluation on the FGVC-Aircraft, FC100, Omniglot, Traffic Sign, FGVCx Fungi, QuickDraw, and VGG Flower data sets. My experimental results show significantly better performance than several baselines, such as simple ensembling and the standalone best model, as well as other competitive meta-learning techniques.
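The second approach, an L2-regularized one-hidden-layer MLP head trained on frozen backbone features, can be sketched roughly as below. The synthetic features, layer sizes, and training hyperparameters are illustrative assumptions, not the thesis's actual configuration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-in for frozen features: in the thesis these would come
# from a library of CNN backbones pre-trained on a data set like ILSVRC2012.
def fake_backbone_features(n, cls):
    return float(cls) + rng.normal(scale=0.3, size=(n, 16))

X = np.vstack([fake_backbone_features(20, 0), fake_backbone_features(20, 1)])
y = np.array([0] * 20 + [1] * 20, dtype=float)

# One-hidden-layer MLP head; the L2 penalty `lam` curbs overfitting in the
# small-data regime (all sizes and hyperparameters here are illustrative).
H, lam, lr = 8, 1e-3, 0.5
W1 = rng.normal(scale=0.1, size=(16, H)); b1 = np.zeros(H)
W2 = rng.normal(scale=0.1, size=H);       b2 = 0.0

def forward(X):
    h = np.maximum(0.0, X @ W1 + b1)           # ReLU hidden layer
    p = 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))   # sigmoid output
    return h, p

for _ in range(1000):                          # gradient descent on BCE + L2
    h, p = forward(X)
    g = (p - y) / len(y)                       # dLoss/dlogit
    gh = np.outer(g, W2) * (h > 0)             # backprop through ReLU
    W2 -= lr * (h.T @ g + lam * W2); b2 -= lr * g.sum()
    W1 -= lr * (X.T @ gh + lam * W1); b1 -= lr * gh.sum(axis=0)

train_acc = float(np.mean((forward(X)[1] > 0.5) == (y > 0.5)))
```

Swapping the fake features for real backbone activations is the only change this sketch would need; the backbone itself stays frozen, so only the small head is fit to the few-shot task.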

Degree
Doctor of Philosophy
Type
Thesis
Keywords
Meta-learning, one-shot, few-shot, image classification, transfer learning, machine learning.
Citation

Chowdhury, Arkabandhu. "Meta Approaches to Few-shot Image Classification." (2021) Diss., Rice University. https://hdl.handle.net/1911/110261.

Rights
Copyright is held by the author, unless otherwise indicated. Permission to reuse, publish, or reproduce the work beyond the bounds of fair use or other exemptions to copyright law must be obtained from the copyright holder.