Dynamic Sparsity for Efficient Machine Learning

dc.contributor.advisor: Shrivastava, Anshumali
dc.creator: Liu, Zichang
dc.date.accessioned: 2024-05-21T21:26:26Z
dc.date.available: 2024-05-21T21:26:26Z
dc.date.created: 2024-05
dc.date.issued: 2024-04-15
dc.date.submitted: May 2024
dc.date.updated: 2024-05-21T21:26:26Z
dc.description.abstract: Over the past decades, machine learning (ML) models have delivered remarkable accomplishments across a wide range of applications; large language models, for example, have ushered in a new wave of excitement in artificial intelligence. These accomplishments also reveal a scaling law in machine learning: larger models, equipped with more parameters and trained on more extensive datasets, often significantly outperform their smaller counterparts. However, the trend of increasing model size inevitably introduces unprecedented computational resource requirements, creating substantial challenges for model training and deployment. This thesis aims to improve the efficiency of ML models through algorithmic advancements. Specifically, we exploit the dynamic sparsity pattern inside ML models to achieve efficiency goals. Dynamic sparsity refers to the subset of parameters or activations that are important for a given input; different inputs may exhibit different dynamic sparsity patterns. We advocate identifying the dynamic sparsity pattern for each input and focusing computation and memory resources on it. The first part of this thesis centers on the inference stage. We verify the existence of dynamic sparsity in trained ML models, namely within the classification layer, the attention mechanism, and the transformer layers. We further demonstrate that this dynamic sparsity can be cheaply predicted for each input and leveraged to improve inference efficiency. The subsequent part of the dissertation shifts its focus to the training stage, where dynamic sparsity emerges as a tool to mitigate the problems of catastrophic forgetting and of data heterogeneity in federated learning, thereby improving training efficiency.
dc.format.mimetype: application/pdf
dc.identifier.citation: Liu, Zichang. Dynamic Sparsity for Efficient Machine Learning. (2024). PhD diss., Rice University. https://hdl.handle.net/1911/116108
dc.identifier.uri: https://hdl.handle.net/1911/116108
dc.language.iso: eng
dc.rights: Copyright is held by the author, unless otherwise indicated. Permission to reuse, publish, or reproduce the work beyond the bounds of fair use or other exemptions to copyright law must be obtained from the copyright holder.
dc.subject: Machine Learning
dc.subject: Large Language Model
dc.subject: Sparsity
dc.title: Dynamic Sparsity for Efficient Machine Learning
dc.type: Thesis
dc.type.material: Text
thesis.degree.department: Computer Science
thesis.degree.discipline: Engineering
thesis.degree.grantor: Rice University
thesis.degree.level: Doctoral
thesis.degree.name: Doctor of Philosophy
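
The abstract above describes dynamic sparsity as selecting, for each input, the small subset of parameters or activations that actually matter and spending computation and memory only on that subset. As a rough, hypothetical illustration of the idea only (a minimal Python/NumPy sketch, not code from the thesis), the snippet below keeps the top-k output activations of a dense layer for each input; a different input generally selects a different subset:

    import numpy as np

    def dynamic_sparse_forward(x, W, k):
        # Dense scores for one input: shape (out_dim,)
        scores = W @ x
        # Keep only the k entries most important for this particular input.
        topk = np.argpartition(np.abs(scores), -k)[-k:]
        sparse_out = np.zeros_like(scores)
        sparse_out[topk] = scores[topk]
        return sparse_out, topk

    # Toy usage: two inputs activate different subsets of the same layer.
    rng = np.random.default_rng(0)
    W = rng.normal(size=(1000, 64))
    for _ in range(2):
        x = rng.normal(size=64)
        _, active = dynamic_sparse_forward(x, W, k=10)
        print(sorted(active.tolist()))

In the setting the abstract describes, the important subset is predicted cheaply for each input rather than computed from the full dense product as above, which is where the efficiency gain comes from.
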
Files
Original bundle (1 of 1)
  LIU-DOCUMENT-2024.pdf (1.98 MB, Adobe Portable Document Format)
License bundle (2 of 2)
  PROQUEST_LICENSE.txt (5.84 KB, Plain Text)
  LICENSE.txt (2.98 KB, Plain Text)