Memory efficient computation for large scale machine learning and data inference

Date
2022-08-11
Abstract

With the rapid growth of large-scale, high-dimensional datasets, large-scale machine learning and statistical inference have become increasingly common in everyday applications. Although modern computation hardware (such as GPUs) has brought an exponential speed-up in computation, memory remains expensive and has become a main bottleneck for these large-scale learning and inference tasks. This thesis focuses on developing scalable, memory-efficient learning and inference algorithms built on probabilistic data structures.

We first aim to solve the low-memory, high-speed membership testing problem. Membership testing answers whether a query q belongs to a set S. It has many applications in web services, such as malicious URL testing and search query caching. However, due to limited memory budgets and constrained response times, membership testing has to be both fast and memory efficient. We propose two learned Bloom filter algorithms, which combine a machine learning classifier with Bloom filters to achieve low memory usage, high inference speed, and state-of-the-art false positive rates (FPR).
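The following is a minimal sketch of the generic learned Bloom filter layout (a classifier used as a pre-filter, plus a backup Bloom filter that stores the keys the classifier misses, so no false negatives are introduced). The names score_fn and threshold, the hashing scheme, and the single-threshold design are illustrative assumptions; the thesis's two algorithms refine how classifier scores and backup filters are combined.

import hashlib
import math


class BloomFilter:
    """Standard Bloom filter: k hash functions over an m-bit array."""

    def __init__(self, num_bits, num_hashes):
        self.m = num_bits
        self.k = num_hashes
        self.bits = bytearray(math.ceil(num_bits / 8))

    def _positions(self, item):
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.m

    def add(self, item):
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)

    def contains(self, item):
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(item))


class LearnedBloomFilter:
    """Classifier + backup filter: queries scored above `threshold` are accepted
    by the model; set members the model misses are stored in a smaller backup
    Bloom filter, so the combined structure never returns a false negative."""

    def __init__(self, score_fn, threshold, keys, backup_bits=8192, backup_hashes=4):
        self.score_fn = score_fn            # hypothetical trained classifier returning a score in [0, 1]
        self.threshold = threshold
        self.backup = BloomFilter(backup_bits, backup_hashes)
        for key in keys:
            if score_fn(key) < threshold:   # false negative of the model -> insert into backup filter
                self.backup.add(key)

    def contains(self, query):
        if self.score_fn(query) >= self.threshold:
            return True                     # may be a false positive of the model
        return self.backup.contains(query)  # may be a false positive of the backup filter

Because the backup filter only has to cover the classifier's false negatives, it can be much smaller than a single Bloom filter covering all of S, which is where the memory saving comes from.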

Secondly, we show a novel use of a probabilistic data structure (Count Sketch) to solve the high-dimensional covariance matrix estimation problem. High-dimensional covariance matrix estimation plays a critical role in many machine learning and statistical inference problems, but the memory cost of storing a covariance matrix grows quadratically with the dimension. Hence, when the dimension reaches the scale of millions, storing the whole covariance matrix in memory is practically impossible. However, the sparse nature of most high-dimensional covariance matrices gives us hope of recovering only the large covariance entries. We incorporate active sampling into the Count Sketch algorithm to project the covariances into a compressed data structure. It uses only sub-linear memory while still locating the large covariance entries with high accuracy.
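Below is a minimal Count Sketch sketch showing how covariance entries can be accumulated into a fixed-size table instead of a dense d x d matrix: each key (i, j) is hashed to one bucket per row with a random sign, and an entry is read back as the median of its signed buckets. The hashing scheme, bucket sizes, and the sketch_covariance helper are illustrative assumptions, and the active sampling component of the thesis (which decides which entries to update) is omitted.

import numpy as np


class CountSketch:
    """Count Sketch table: num_rows independent hash rows of num_buckets buckets.
    Query recovers an entry as the median of its signed bucket values."""

    def __init__(self, num_rows=5, num_buckets=2**20, seed=0):
        rng = np.random.default_rng(seed)
        self.num_rows = num_rows
        self.num_buckets = num_buckets
        self.table = np.zeros((num_rows, num_buckets))
        # per-row seeds for the bucket hash and the sign hash (illustrative hashing scheme)
        self.bucket_seeds = rng.integers(1, 2**31, size=num_rows)
        self.sign_seeds = rng.integers(1, 2**31, size=num_rows)

    def _bucket(self, row, key):
        return hash((int(self.bucket_seeds[row]), key)) % self.num_buckets

    def _sign(self, row, key):
        return 1 if hash((int(self.sign_seeds[row]), key)) % 2 else -1

    def update(self, key, value):
        for r in range(self.num_rows):
            self.table[r, self._bucket(r, key)] += self._sign(r, key) * value

    def query(self, key):
        return float(np.median(
            [self._sign(r, key) * self.table[r, self._bucket(r, key)]
             for r in range(self.num_rows)]
        ))


def sketch_covariance(samples, pairs, sketch):
    """Stream zero-mean samples x and add the rank-one contribution x_i * x_j of
    each tracked entry (i, j) to the sketch; memory stays at num_rows * num_buckets
    regardless of the dimension d. `pairs` stands in for the (sub)sampled entries
    that would be chosen by active sampling."""
    for x in samples:
        for (i, j) in pairs:
            sketch.update((i, j), x[i] * x[j])

Because heavy (large-magnitude) entries dominate their buckets, the median-of-signed-buckets query recovers the large covariance entries accurately while small entries mostly cancel out.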

Finally, we explore memory- and communication-efficient algorithms for extreme classification tasks under the federated learning setup. Federated learning enables many local devices to train a deep learning model jointly without sharing their local data. Currently, most federated training schemes learn a global model by averaging the parameters of local models. However, this approach suffers from the high communication cost of transmitting full local model parameters. For federated learning tasks involving extreme classification in particular, 1) communication becomes the main bottleneck, since the model size grows proportionally to the number of output classes; and 2) extreme classification (such as user recommendation) typically has extremely imbalanced classes and heterogeneous data across devices. We propose to reduce the model size by compressing the output classes with Count Sketch, which significantly reduces memory usage while still maintaining the information of the major classes.
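As a rough illustration of output-class compression, the sketch below replaces a per-class output layer (memory proportional to the number of classes) with Count-Sketch-style shared buckets: each class is hashed, with a random sign, into one bucket per row, and its score is read back as the median of its signed bucket logits. The class and method names, bucket sizes, and the way scores are recovered are assumptions for illustration only; the thesis's federated training and aggregation scheme is not reproduced here.

import numpy as np


class SketchedOutputLayer:
    """Hash-compressed output layer: instead of one weight vector per class
    (memory ~ num_classes x hidden_dim), classes share num_rows x num_buckets
    weight vectors through Count-Sketch-style hashing."""

    def __init__(self, hidden_dim, num_classes, num_rows=3, num_buckets=4096, seed=0):
        rng = np.random.default_rng(seed)
        self.num_rows, self.num_buckets = num_rows, num_buckets
        # compressed weights: num_rows * num_buckets vectors instead of num_classes vectors
        self.weights = rng.normal(0.0, 0.01, size=(num_rows, num_buckets, hidden_dim))
        # fixed random hashes: bucket index and sign for every (row, class) pair
        self.buckets = rng.integers(0, num_buckets, size=(num_rows, num_classes))
        self.signs = rng.choice([-1.0, 1.0], size=(num_rows, num_classes))

    def class_scores(self, h, class_ids):
        """Recover approximate logits for `class_ids` from the compressed layer,
        given a hidden representation h of shape (hidden_dim,)."""
        scores = []
        for c in class_ids:
            per_row = [self.signs[r, c] * (self.weights[r, self.buckets[r, c]] @ h)
                       for r in range(self.num_rows)]
            scores.append(np.median(per_row))
        return np.array(scores)

Since major (frequent) classes dominate the buckets they hash into, their scores survive the compression, while the layer's size, and hence the per-round communication cost, no longer scales with the number of output classes.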

Degree
Doctor of Philosophy
Type
Thesis
Keywords
Randomized Algorithm, Large Scale Machine Learning, Large Scale Data Inference, Sublinear Memory
Citation

Dai, Zhenwei. "Memory efficient computation for large scale machine learning and data inference." (2022) Diss., Rice University. https://hdl.handle.net/1911/113281.

Rights
Copyright is held by the author, unless otherwise indicated. Permission to reuse, publish, or reproduce the work beyond the bounds of fair use or other exemptions to copyright law must be obtained from the copyright holder.