Long-Context Sequence Models for Image Retrieval

Xiao, Zilin

dc.contributor.advisor	Ordóñez-Román, Vicente	en_US
dc.creator	Xiao, Zilin	en_US
dc.date.accessioned	2025-01-16T20:48:28Z	en_US
dc.date.available	2025-01-16T20:48:28Z	en_US
dc.date.created	2024-12	en_US
dc.date.issued	2024-10-25	en_US
dc.date.submitted	December 2024	en_US
dc.date.updated	2025-01-16T20:48:28Z	en_US
dc.description.abstract	Image retrieval is an important problem in computer vision with many applications. Retrieval is usually cast as a metric learning problem in which a model is trained under a distance or similarity objective to compare pairs of inputs. In this thesis, we introduce the Extractive Image Re-ranker (ExtReranker), a model that takes as input the local features of a query image and a group of gallery images and outputs a refined ranking list in a single forward pass. The model fits the typical two-stage retrieval pipeline, in which a query image is first compared against a large database of images using global features, and the retrieved gallery is then re-ranked using finer-grained local features. ExtReranker formulates re-ranking as a span extraction task, analogous to text span extraction in natural language processing. In contrast to pair-wise correspondence learning, our approach leverages long-context sequence models to capture the list-wise dependencies between query and gallery images at the local-feature level. Our approach outperforms other re-rankers on established image retrieval benchmarks (CUB-200, SOP, and In-Shop). ExtReranker also achieves state-of-the-art re-ranking performance relative to alternative methods on ROxford and RParis while using 10X fewer local descriptors and having 5X lower forward latency.	en_US
dc.format.mimetype	application/pdf	en_US
dc.identifier.uri	https://hdl.handle.net/1911/118198	en_US
dc.language.iso	en	en_US
dc.subject	image retrieval	en_US
dc.subject	long-context language models	en_US
dc.title	Long-Context Sequence Models for Image Retrieval	en_US
dc.type	Thesis	en_US
dc.type.material	Text	en_US
thesis.degree.department	Computer Science	en_US
thesis.degree.discipline	Computer Science	en_US
thesis.degree.grantor	Rice University	en_US
thesis.degree.level	Masters	en_US
thesis.degree.name	Master of Science	en_US

Files

Original bundle

Name: XIAO-DOCUMENT-2024.pdf
Size: 3.04 MB
Format: Adobe Portable Document Format


Collections

Rice University Theses and Dissertations