Compiler and Runtime Optimization of Computational Kernels for Irregular Applications

dc.contributor.advisorMellor-Crummey, John
dc.contributor.advisorBudimlić, Zoran
dc.contributor.committeeMemberVarman, Peter J
dc.contributor.committeeMemberMamouras, Konstantinos
dc.creatorMilakovic, Srdan
dc.date.accessioned2023-09-01T19:33:35Z
dc.date.created2023-08
dc.date.issued2023-08-17
dc.date.submittedAugust 2023
dc.date.updated2023-09-01T19:33:36Z
dc.descriptionEMBARGO NOTE: This item is embargoed until 2024-08-01
dc.description.abstractMany computationally-intensive workloads do not fit on individual compute nodes due to their size. As a consequence, such workloads are usually executed on multiple heterogenous compute nodes of a cluster or supercomputer. However, due to the complexity of the hardware, developing efficient and scalable code for modern compute nodes is difficult. Another challenge with sophisticated applications is that data structures, communication, and control patterns are often irregular and unknown before the program execution. Lack of regularity makes static analysis especially difficult or very often impossible. To overcome these issues, programmers use high-level and implicitly parallel programming models or domain-specific libraries that consist of composable building blocks. This dissertation explores compiler and runtime optimizations for automatic granularity selection in the context of two programming paradigms: Concurrent Collections (CnC)---a declarative,dynamic single-assignment, data-race free programming model---and GraphBLAS--a domain-specific Application-specific Programming Interface (API)---. Writing fine-grained CnC programs is easy and intuitive for domain experts because the programmers do not have to worry about parallelism. Additionally, fine-grained programs expose maximum parallelism. However, fine-grained programs can significantly increase the runtime overhead of CnC program execution due to a large number of data accesses and dependencies between computation tasks with respect to the amount of computation that is done by a fine-grained task. Runtime overhead can be reduced by coarsening the data accesses and task dependencies. However, coarsening is usually tedious, and it is not easy even for domain experts. For some applications, the coarse-grained code can be generated by a compiler. However, not all fine-grained applications can be converted to coarse-grained applications because not all information is statically known. In this dissertation, we introduce the concept of micro-runtimes. A micro-runtime is a Hierarchical CnC construct that enables fusion of multiple steps into a higher-level step during program execution. Another way for users to develop applications that efficiently exploit modern hardware is through domain-specific APIs that define composable building blocks. One such API specification is GraphBLAS. GraphBLAS allows users to specify graph algorithms using (sparse) linear algebra building blocks. Even though GraphBLAS libraries usually consist of highly hand-optimized building blocks, GraphBLAS libraries provide limited or no support for inter-kernel optimization. In this dissertation, we investigate multiple different approaches for inter-kernel optimization, including runtime optimizations and compile-time optimizations. Our optimizations reduce the number of arithmetic operations, memory accesses, and memory required for temporary objects.
dc.embargo.lift2024-08-01
dc.embargo.terms2024-08-01
dc.format.mimetypeapplication/pdf
dc.identifier.citationMilakovic, Srdan. "Compiler and Runtime Optimization of Computational Kernels for Irregular Applications." (2023) Diss., Rice University. https://hdl.handle.net/1911/115234.
dc.identifier.urihttps://hdl.handle.net/1911/115234
dc.language.isoeng
dc.rightsCopyright is held by the author, unless otherwise indicated. Permission to reuse, publish, or reproduce the work beyond the bounds of fair use or other exemptions to copyright law must be obtained from the copyright holder.
dc.subjectConcurrent Collections
dc.subjectGraph BLAS
dc.subjectOperator Fusion
dc.titleCompiler and Runtime Optimization of Computational Kernels for Irregular Applications
dc.typeThesis
dc.type.materialText
thesis.degree.departmentComputer Science
thesis.degree.disciplineEngineering
thesis.degree.grantorRice University
thesis.degree.levelDoctoral
thesis.degree.nameDoctor of Philosophy
Files
License bundle
Now showing 1 - 2 of 2
No Thumbnail Available
Name:
PROQUEST_LICENSE.txt
Size:
5.84 KB
Format:
Plain Text
Description:
No Thumbnail Available
Name:
LICENSE.txt
Size:
2.98 KB
Format:
Plain Text
Description: