Automated Deep Learning Algorithm and Accelerator Co-search for Both Boosted Hardware Efficiency and Task Accuracy

dc.contributor.advisor: Lin, Yingyan
dc.creator: Zhang, Yongan
dc.date.accessioned: 2023-05-31T20:30:19Z
dc.date.available: 2023-05-31T20:30:19Z
dc.date.created: 2023-08
dc.date.issued: 2023-04-24
dc.date.submitted: August 2023
dc.date.updated: 2023-05-31T20:30:19Z
dc.description.abstract: Powerful yet complex deep neural networks (DNNs) have fueled a booming demand for efficient DNN solutions to bring DNN-powered intelligence into numerous applications. Jointly optimizing the networks and their accelerators is promising for delivering optimal performance. However, the great potential of such solutions has yet to be unleashed, due to the challenge of simultaneously exploring the vast, entangled, yet different design spaces of the networks and their accelerators. To this end, we propose DIAN, a DIfferentiable Accelerator-Network co-search framework that automatically searches for matched networks and accelerators to maximize both task accuracy and acceleration efficiency. Specifically, DIAN integrates two enablers: (1) a generic design space for DNN accelerators that is applicable to both FPGA- and ASIC-based DNN accelerators and compatible with DNN frameworks such as PyTorch, enabling algorithmic exploration of more efficient DNNs and their accelerators; and (2) a joint DNN network and accelerator co-search algorithm that simultaneously searches for optimal DNN structures and their accelerators' micro-architectures and mapping methods to maximize both task accuracy and acceleration efficiency. Experiments and ablation studies based on FPGA measurements and ASIC synthesis show that the matched networks and accelerators generated by DIAN consistently outperform state-of-the-art (SOTA) DNNs and DNN accelerators (e.g., 3.04× better FPS with 5.46% higher accuracy on ImageNet), while requiring notably reduced search time (up to 1234.3×) compared with SOTA co-exploration methods, when evaluated against ten SOTA baselines on three datasets.
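The abstract describes a differentiable co-search that jointly relaxes the network and accelerator design spaces and optimizes a combined accuracy/efficiency objective. The sketch below illustrates that general idea in PyTorch with softmax-relaxed choices over candidate network operators and accelerator configurations and a latency-weighted loss. It is a minimal illustration under assumed names (CoSearchCell, AcceleratorParams, co_search_step, latency_table) and an assumed lookup-table cost model; it is not the thesis's actual implementation.

```python
# Illustrative sketch of differentiable network/accelerator co-search.
# All class/function names and the table-based latency model are assumptions,
# not code from the DIAN framework.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CoSearchCell(nn.Module):
    """Searchable layer: softmax-relaxed mixture over candidate network operators."""
    def __init__(self, candidate_ops):
        super().__init__()
        self.ops = nn.ModuleList(candidate_ops)
        self.alpha = nn.Parameter(torch.zeros(len(candidate_ops)))  # network arch params

    def forward(self, x):
        weights = F.softmax(self.alpha, dim=0)
        return sum(w * op(x) for w, op in zip(weights, self.ops))

class AcceleratorParams(nn.Module):
    """Relaxed choice over discrete accelerator configurations (e.g., PE-array sizes)."""
    def __init__(self, num_choices):
        super().__init__()
        self.beta = nn.Parameter(torch.zeros(num_choices))  # accelerator arch params

    def expected_latency(self, latency_table):
        # latency_table[i]: estimated latency of the network on accelerator choice i.
        return (F.softmax(self.beta, dim=0) * latency_table).sum()

def co_search_step(model, accel, images, labels, latency_table, optimizer, lam=0.1):
    """One joint gradient step over weights, network choices, and accelerator choices."""
    optimizer.zero_grad()
    task_loss = F.cross_entropy(model(images), labels)
    # Differentiable hardware-cost term steers both search spaces toward efficiency.
    loss = task_loss + lam * accel.expected_latency(latency_table)
    loss.backward()
    optimizer.step()
    return loss.item()

# Toy usage with random data, only to show how the pieces connect.
candidate_ops = [nn.Conv2d(16, 16, 3, padding=1), nn.Conv2d(16, 16, 5, padding=2)]
model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), CoSearchCell(candidate_ops),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 10),
)
accel = AcceleratorParams(num_choices=4)
latency_table = torch.tensor([1.0, 0.8, 0.6, 0.9])  # assumed per-choice latency estimates
optimizer = torch.optim.Adam(list(model.parameters()) + list(accel.parameters()), lr=1e-3)
images, labels = torch.randn(8, 3, 32, 32), torch.randint(0, 10, (8,))
print(co_search_step(model, accel, images, labels, latency_table, optimizer))
```

In a full co-search, the efficiency term would come from an analytical or simulated model over accelerator micro-architecture and mapping choices rather than a fixed table, and architecture parameters are typically updated on a separate data split from the network weights.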
dc.format.mimetype: application/pdf
dc.identifier.citation: Zhang, Yongan. "Automated Deep Learning Algorithm and Accelerator Co-search for Both Boosted Hardware Efficiency and Task Accuracy." (2023) Master's Thesis, Rice University. https://hdl.handle.net/1911/114896.
dc.identifier.uri: https://hdl.handle.net/1911/114896
dc.language.iso: eng
dc.rights: Copyright is held by the author, unless otherwise indicated. Permission to reuse, publish, or reproduce the work beyond the bounds of fair use or other exemptions to copyright law must be obtained from the copyright holder.
dc.subject: Deep Learning
dc.subject: Hardware Acceleration
dc.subject: Hardware Design Automation
dc.subject: Algorithm-Hardware Co-design
dc.title: Automated Deep Learning Algorithm and Accelerator Co-search for Both Boosted Hardware Efficiency and Task Accuracy
dc.type: Thesis
dc.type.material: Text
thesis.degree.department: Electrical and Computer Engineering
thesis.degree.discipline: Engineering
thesis.degree.grantor: Rice University
thesis.degree.level: Masters
thesis.degree.name: Master of Science
Files
Original bundle
ZHANG-DOCUMENT-2023.pdf (2.33 MB, Adobe Portable Document Format)
License bundle
PROQUEST_LICENSE.txt (5.84 KB, Plain Text)
LICENSE.txt (2.61 KB, Plain Text)