Automatic and Interactive Parallelization

McKinley, Kathryn

dc.contributor.author	McKinley, Kathryn
dc.date.accessioned	2017-08-02T22:03:17Z
dc.date.available	2017-08-02T22:03:17Z
dc.date.issued	1994-03
dc.date.note	March 1994
dc.description	This work was also published as a Rice University thesis/dissertation: http://hdl.handle.net/1911/16540
dc.description.abstract	The goal of this dissertation is to give programmers the ability to achieve high performance by focusing on developing parallel algorithms, rather than on architecture-specific details. The advantages of this approach also include program portability and legibility. To achieve high performance, we provide automatic compilation techniques that tailor parallel algorithms to shared-memory multiprocessors with local caches and a common bus. In particular, the compiler maps complete applications onto the specifics of a machine, exploiting both parallelism and memory. To optimize complete applications, we develop novel, general algorithms to transform loops that contain arbitrary conditional control flow. In addition, we provide new interprocedural transformations which enable optimization across procedure boundaries. These techniques provide the basis for a robust automatic parallelizing algorithm that is applicable to complete programs. The algorithm for automatic parallel code generation takes into consideration the interaction of parallelism and data locality, as well as the overhead of parallelism. The algorithm is based on a simple cost model that accurately predicts cache line reuse from multiple accesses to the same memory location and from consecutive accesses. The optimizer uses this model to improve data locality. It also uses the model to discover and introduce effective parallelism that complements the benefits of data locality. The optimizer further improves the effectiveness of parallelism by seeking to increase its granularity. Parallelism is introduced only when granularity is sufficient to overcome its associated costs. The algorithm for parallel code generation is shown to be efficient and several of its component algorithms are proven optimal. The efficacy of the optimizer is illustrated with experimental results. In most cases, it is very effective and either achieves or improves the performance of hand-crafted parallel programs. When performance is not satisfactory, we provide an interactive parallel programming tool which combines compiler analysis and algorithms with human expertise.
dc.format.extent	165 pp
dc.identifier.citation	McKinley, Kathryn. "Automatic and Interactive Parallelization." (1994) https://hdl.handle.net/1911/96425.
dc.identifier.digital	TR92-182
dc.identifier.uri	https://hdl.handle.net/1911/96425
dc.language.iso	eng
dc.rights	You are granted permission for the noncommercial reproduction, distribution, display, and performance of this technical report in any format, but this permission is only for a period of forty-five (45) days from the most recent time that you verified that this technical report is still available from the Computer Science Department of Rice University under terms that include this permission. All other rights are reserved by the author(s).
dc.title	Automatic and Interactive Parallelization
dc.type	Technical report
dc.type.dcmi	Text

Files

Original bundle

Name: TR92-182.pdf
Size: 8.16 MB
Format: Adobe Portable Document Format

Collections

Computer Science Technical Reports