More from Less: Learning with Limited Annotated Data in Vision and Language

Date
2024-04-18
Abstract

Deep learning has significantly changed how we train models that can effectively interpret and model the world around us. So far, we have achieved unprecedented advances thanks to a massive effort in collecting and curating annotated samples from the real world. However, the research community recognizes that this approach is unsustainable and has instead turned its attention to collecting large amounts of unlabeled, noisy, and weakly supervised data. Such "data in the wild" can be gathered from textual descriptions and images on the Internet. Unfortunately, this Internet content reflects the interactions of only a portion of the global population. This limitation inevitably reduces the diversity of the available data and also limits its coverage of specialized knowledge. In this thesis, we aim to develop techniques for training efficient intelligent systems using limited amounts of labeled data, and we explore the extent to which alternative sources of data, such as synthetic images, can be leveraged to learn useful skills and representations. In particular, this thesis addresses four key research questions. How can we: (a) learn with limited annotated data, (b) learn to augment the available data, (c) learn to generalize to novel data, and (d) utilize alternative data sources while ensuring privacy through synthetic data generation? In the context of learning with limited annotated data, we propose a pseudo-labeling approach that exploits curriculum learning principles to achieve robustness against out-of-distribution data. We also investigate methods to learn robust compositional representations, employing data augmentation techniques to expand the underlying knowledge present in observed data. Furthermore, we explore zero-shot learning to generalize to novel data, analyzing different methods and feature alignment techniques. Finally, we address the challenge of modeling the world without compromising privacy and ethical principles by generating realistic synthetic data, which has proven useful for training models that perform well on real test data. We hope that the outcomes of this thesis contribute to a broader vision of creating novel algorithms that can exploit and control diverse sources of data, allowing for the development of unbiased and truthful knowledge and information for training deep learning models.

Degree
Doctor of Philosophy
Type
Thesis
Keywords
Computer Vision, Natural Language Processing, Machine Learning
Citation

Cascante-Bonilla, Paola. More from Less: Learning with Limited Annotated Data in Vision and Language. (2024). PhD diss., Rice University. https://hdl.handle.net/1911/116201

Rights
Copyright is held by the author, unless otherwise indicated. Permission to reuse, publish, or reproduce the work beyond the bounds of fair use or other exemptions to copyright law must be obtained from the copyright holder.