Locality Sensitive Sampling for Extreme-Scale Optimization and Deep Learning

Date
2020-08-11
Abstract

The exponential growth of data poses a number of challenges for scaling learning algorithms in machine learning and deep learning. This thesis explores and tackles these computational challenges with randomized hashing algorithms and sheds new light on Locality Sensitive Hashing (LSH) as an adaptive sampler for large-scale estimation. We first introduce the chicken-and-egg loop in large-scale optimization: SGD estimates the gradient by uniform sampling with sample size one, and several works achieve faster epoch-wise convergence by using weighted non-uniform sampling for better gradient estimates, but the per-iteration cost of maintaining such an adaptive distribution exceeds the cost of computing the full gradient itself. We break this barrier by providing the first demonstration of LSH-sampled stochastic gradient descent (LGD), which achieves superior gradient estimation while keeping the sampling cost per iteration comparable to that of uniform sampling. We then demonstrate the power of LSH sampling in SLIDE (Sub-LInear Deep learning Engine), which drastically reduces the computation of extreme-scale neural network training and, using only a CPU, outperforms an optimized TensorFlow (TF) implementation running on the best available GPU. Our evaluations on industry-scale recommendation datasets with large fully connected architectures show that, at any given accuracy level, training with SLIDE on a 44-core CPU is more than 3.5 times faster (1 hour vs. 3.5 hours) than the same network trained using TF on a Tesla V100. In addition to exploring these new possibilities for LSH, we also extend several classical hashing algorithms from the literature.
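To make the LGD idea concrete, the following is a minimal sketch of LSH-based adaptive sampling for SGD on a toy linear least-squares objective, assuming a SimHash (signed random projection) family. The problem sizes, hash parameters, and the per-point probability approximation are illustrative assumptions, not the thesis implementation.

import numpy as np

rng = np.random.default_rng(0)

# Toy problem: linear least squares, where grad_i(w) = (x_i . w - y_i) x_i.
n, d = 10_000, 32
X = rng.standard_normal((n, d))
y = X @ rng.standard_normal(d) + 0.1 * rng.standard_normal(n)

K, L = 4, 8  # hash bits per table and number of tables (assumed values)
projections = [rng.standard_normal((K, d)) for _ in range(L)]

def signature(P, v):
    # K-bit SimHash signature: the sign pattern of K random projections.
    return tuple((P @ v > 0).astype(int).tolist())

# Preprocessing: bucket every data point in each of the L hash tables.
tables = [{} for _ in range(L)]
for t, P in enumerate(projections):
    for i in range(n):
        tables[t].setdefault(signature(P, X[i]), []).append(i)

def lsh_sample(w):
    # Query a random table with the current iterate w; points whose
    # gradients are large tend to collide with w more often.
    t = int(rng.integers(L))
    bucket = tables[t].get(signature(projections[t], w))
    if not bucket:
        return int(rng.integers(n)), 1.0 / n  # fall back to uniform sampling
    i = bucket[rng.integers(len(bucket))]
    # SimHash collision probability: (1 - theta/pi)^K, theta = angle(w, x_i).
    cos = X[i] @ w / (np.linalg.norm(X[i]) * np.linalg.norm(w) + 1e-12)
    p_collide = (1.0 - np.arccos(np.clip(cos, -1.0, 1.0)) / np.pi) ** K
    # Rough per-point sampling probability, used only for illustration;
    # the thesis develops the exact estimator.
    return i, p_collide / len(bucket)

def lgd_step(w, lr=1e-3):
    i, p = lsh_sample(w)
    g = (X[i] @ w - y[i]) * X[i]
    return w - lr * g / (n * p)  # 1/(n p_i) reweighting keeps the estimate roughly unbiased

w = np.zeros(d)
for _ in range(1000):
    w = lgd_step(w)

The key point of the sketch is the cost profile: each iteration hashes only the query w and touches one bucket, so the adaptive distribution is maintained implicitly by the tables rather than recomputed per step.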
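In the same spirit, here is a minimal sketch of the SLIDE-style idea for one wide fully connected layer: neuron weight vectors are indexed in LSH tables, and a forward pass evaluates only the neurons whose hashes collide with the input, instead of a dense matrix multiply over all of them. SimHash, the layer sizes, and the table parameters are again assumptions for illustration.

import numpy as np

rng = np.random.default_rng(1)

d_in, d_out = 128, 100_000  # a wide output layer, where dense matmuls dominate
K, L = 6, 4                 # assumed hash parameters

W = rng.standard_normal((d_out, d_in)) / np.sqrt(d_in)
b = np.zeros(d_out)
projections = [rng.standard_normal((K, d_in)) for _ in range(L)]

def signature(P, v):
    return tuple((P @ v > 0).astype(int).tolist())

# Index every neuron (row of W) in each of the L tables.
tables = [{} for _ in range(L)]
for t, P in enumerate(projections):
    for j in range(d_out):
        tables[t].setdefault(signature(P, W[j]), []).append(j)

def sparse_forward(x):
    # Collect the neurons colliding with x in any table; compute only those.
    active = set()
    for t, P in enumerate(projections):
        active.update(tables[t].get(signature(P, x), ()))
    ids = np.fromiter(active, dtype=np.int64, count=len(active))
    return ids, np.maximum(W[ids] @ x + b[ids], 0.0)  # ReLU over the active set

ids, acts = sparse_forward(rng.standard_normal(d_in))
print(f"evaluated {ids.size} of {d_out} neurons")

Because backpropagation then only needs to update the rows of W in the active set, the work per example scales with the number of collisions rather than the layer width, which is what lets a CPU implementation compete with a dense GPU one.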

Degree
Doctor of Philosophy
Type
Thesis
Keywords
Deep Learning, Locality Sensitive Hashing
Citation

Chen, Beidi. "Locality Sensitive Sampling for Extreme-Scale Optimization and Deep Learning." (2020) Diss., Rice University. https://hdl.handle.net/1911/109187.

Rights
Copyright is held by the author, unless otherwise indicated. Permission to reuse, publish, or reproduce the work beyond the bounds of fair use or other exemptions to copyright law must be obtained from the copyright holder.