Dynamic Sparsity for Efficient Machine Learning

Advisor: Shrivastava, Anshumali
Dates: 2024-05-21; 2024-05; 2024-04-15; May 2024
Citation: Liu, Zichang. Dynamic Sparsity for Efficient Machine Learning. (2024). PhD diss., Rice University. https://hdl.handle.net/1911/116108
URI: https://hdl.handle.net/1911/116108

Abstract:
Over the past decades, machine learning (ML) models have delivered remarkable accomplishments in various applications. For example, large language models have ushered in a new wave of excitement in artificial intelligence. These accomplishments also reveal a scaling law in machine learning: larger models, equipped with more parameters and trained on more extensive datasets, often significantly outperform their smaller counterparts. However, the trend toward larger models inevitably introduces unprecedented demands on computational resources, creating substantial challenges in model training and deployment. This thesis aims to improve the efficiency of ML models through algorithmic advancements. Specifically, we exploit the dynamic sparsity pattern inside ML models to achieve efficiency goals. Dynamic sparsity refers to the subset of parameters or activations that are important for a given input; different inputs may have different dynamic sparsity patterns. We advocate identifying the dynamic sparsity pattern for each input and focusing computation and memory resources on it. The first part of this thesis centers on the inference stage. We verify the existence of dynamic sparsity in trained ML models, namely within the classification layer, the attention mechanism, and the transformer layers. Further, we demonstrate that such dynamic sparsity can be cheaply predicted and leveraged for each input to improve inference efficiency. The second part of the dissertation shifts its focus to the training stage, where dynamic sparsity emerges as a tool for mitigating catastrophic forgetting or data heterogeneity in federated learning, thereby improving training efficiency.

Format: application/pdf
Language: eng
Rights: Copyright is held by the author, unless otherwise indicated. Permission to reuse, publish, or reproduce the work beyond the bounds of fair use or other exemptions to copyright law must be obtained from the copyright holder.
Keywords: Machine Learning; Large Language Model; Sparsity
Type: Thesis