Array syntax compilation and performance tuning

Date
2007
Abstract

Array syntax adds expressive power to a language by providing operations on and assignments to array sections, allowing programmers to write clear and concise code. However, state-of-the-art vendor compilers fail to map array statements efficiently to the underlying architectures for high performance. The inefficiency stems from three technical problems that remain inadequately solved: (1) reducing the size of allocated temporary arrays; (2) extending solutions to evolving architectures; and (3) applying loop fusion to multiple array statements. Solving these problems is important because otherwise array syntax, though a high-level language feature, may not be widely used by application developers. To address them, this research first develops a novel strategy that minimizes the allocated temporary arrays using loop alignment and loop skewing on scalar processors, thereby reducing memory traffic and improving cache utilization. It then extends the minimization strategy to exploit the increasing on-chip parallelism of evolving architectures that offer vector (e.g., SSE and AltiVec) and multi-core (e.g., CELL) capabilities. In addition, new techniques boost performance by improving data alignment and managing data movement, both of which are important on these new architectures. Last, this dissertation parameterizes loop fusion for performance tuning and explores the properties of the space of all possible loop fusion configurations, in order to expedite the tuning of loop fusion for increased data reuse across multiple array statements. These transformations and optimizations are implemented in a source-to-source research compiler with extensions that target short vector processors and the CELL processor. Experiments show that array statements compiled with our strategy run up to twice as fast as those compiled directly by vendor compilers.
Our exploration of the loop fusion parameter space identifies good candidates for heuristic searching and space pruning, which are essential to making the performance tuning process practical. In summary, this dissertation demonstrates that advanced compilation techniques can significantly improve the performance of programs written in array syntax over current state-of-the-art implementations across a variety of architectures, including the latest multi-core processors with vector capabilities.
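
As a rough illustration of the temporary-array problem the abstract refers to (this sketch is not taken from the dissertation), consider an array statement of the form a(2:n-1) = a(1:n-2) + a(3:n). Array-assignment semantics require the whole right-hand side to be evaluated before any element is stored, so a straightforward translation allocates a full-size temporary. The Python below, with hypothetical function names and plain lists standing in for arrays, contrasts that translation with one that carries a single scalar between iterations. It is a simplified stand-in for the loop alignment and loop skewing strategy, not an implementation of it.

```python
def stencil_with_full_temporary(a):
    """Naive scalarization: evaluate the whole right-hand side into a
    temporary array, then copy it into the destination section."""
    n = len(a)
    temp = [a[i - 1] + a[i + 1] for i in range(1, n - 1)]  # n-2 temporary elements
    result = list(a)  # copy so the caller's list is left untouched
    result[1:n - 1] = temp
    return result


def stencil_with_scalar_temporary(a):
    """Same statement with the temporary reduced to one scalar: the old
    value of the left neighbour is saved before it is overwritten."""
    result = list(a)  # copy so the caller's list is left untouched
    n = len(result)
    if n < 3:
        return result  # the section a(2:n-1) is empty
    left_old = result[0]  # old value of element i-1
    for i in range(1, n - 1):
        current_old = result[i]               # save before overwriting
        result[i] = left_old + result[i + 1]  # right neighbour is still old
        left_old = current_old
    return result


if __name__ == "__main__":
    data = [1, 2, 3, 4, 5]
    print(stencil_with_full_temporary(data))
    print(stencil_with_scalar_temporary(data))
```

Both functions produce the same values; the second needs only constant extra storage because each overwritten element is read again by just the next iteration.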

Degree
Doctor of Philosophy
Type
Thesis
Keywords
Computer science
Citation

Zhao, Yuan. "Array syntax compilation and performance tuning." (2007) Diss., Rice University. https://hdl.handle.net/1911/20676.

Rights
Copyright is held by the author, unless otherwise indicated. Permission to reuse, publish, or reproduce the work beyond the bounds of fair use or other exemptions to copyright law must be obtained from the copyright holder.