An Optimizing Fortran D Compiler for MIMD Distributed-Memory Machines

dc.contributor.author: Tseng, Chau-Wen (en_US)
dc.date.accessioned: 2017-08-02T22:03:19Z (en_US)
dc.date.available: 2017-08-02T22:03:19Z (en_US)
dc.date.issued: 1993-01 (en_US)
dc.date.note: January 1993 (en_US)
dc.description: This work was also published as a Rice University thesis/dissertation: http://hdl.handle.net/1911/16677 (en_US)
dc.description.abstract: Massively parallel MIMD distributed-memory machines can provide enormous computational power; however, the difficulty of developing parallel programs for these machines has limited their use. Our thesis is that an advanced compiler can generate efficient parallel programs, if data decompositions are provided. To validate this thesis, we have implemented a compiler for Fortran D, a version of Fortran that provides data decomposition specifications at two levels: problem mapping using sophisticated array alignments, and machine mapping through a rich set of data distribution functions. The Fortran D compiler is organized around three major functions: program analysis, program optimization, and code generation. Its compilation strategy is based on the "owner computes" rule, where each processor only computes values of data it owns. Data decomposition specifications are translated into mathematical distribution functions that determine the ownership of local data. By composing these with subscript functions or their inverses, the compiler can efficiently partition computation and determine nonlocal accesses at compile-time. Fortran D optimizations are guided by the concept of data dependence. Program transformations modify the program execution order to enable optimizations. Communication optimizations reduce the number of messages and overlap communication with computation. Parallelism optimizations detect reductions and optimize pipelined computations to increase the amount of useful computation that may be performed in parallel. Empirical evaluations show that exploiting parallelism is vital, while message vectorization, coarse-grain pipelining, and collective communication are the key communication optimizations. A simple model is constructed to guide compiler optimizations. Loop indices, bounds, and nonlocal storage are managed by the compiler during code generation. Interprocedural analysis, optimization, and code generation algorithms limit compilation to only one pass over each procedure by collecting summary information after edits, then compiling procedures in reverse topological order to propagate necessary information. Delaying instantiation of the work partition, communication, and dynamic data decomposition enables interprocedural optimization. Interactions between the compiler and other elements of the programming system are discussed. Empirical measurements show that the output of the prototype Fortran D compiler is comparable to hand-written codes on the Intel iPSC/860 and significantly outperforms the CM Fortran compiler on the Thinking Machines CM-5. (en_US)
dc.format.extent: 208 pp (en_US)
dc.identifier.citation: Tseng, Chau-Wen. "An Optimizing Fortran D Compiler for MIMD Distributed-Memory Machines." (1993) https://hdl.handle.net/1911/96431. (en_US)
dc.identifier.digital: TR93-199 (en_US)
dc.identifier.uri: https://hdl.handle.net/1911/96431 (en_US)
dc.language.iso: eng (en_US)
dc.rights: You are granted permission for the noncommercial reproduction, distribution, display, and performance of this technical report in any format, but this permission is only for a period of forty-five (45) days from the most recent time that you verified that this technical report is still available from the Computer Science Department of Rice University under terms that include this permission. All other rights are reserved by the author(s). (en_US)
dc.title: An Optimizing Fortran D Compiler for MIMD Distributed-Memory Machines (en_US)
dc.type: Technical report (en_US)
dc.type.dcmi: Text (en_US)
Files
Original bundle
Name: TR93-199.pdf
Size: 13.45 MB
Format: Adobe Portable Document Format
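
Illustrative note: the abstract describes the "owner computes" rule and the composition of distribution functions with inverse subscript functions. The Python sketch below is not taken from the report. It assumes a simple BLOCK distribution of a 1-based array of n elements over nprocs processors and a loop of the form A(i + lhs_offset) = B(i + rhs_offset). All function names and the block-size formula are hypothetical choices made for illustration only.

```python
from math import ceil

def block_bounds(n, nprocs, p):
    """Inclusive 1-based index range owned by processor p (assumed BLOCK distribution)."""
    size = ceil(n / nprocs)
    return p * size + 1, min((p + 1) * size, n)

def owner(n, nprocs, i):
    """Processor that owns element i under the same assumed BLOCK distribution."""
    size = ceil(n / nprocs)
    return (i - 1) // size

def local_iterations(n, nprocs, p, loop_lo, loop_hi, lhs_offset):
    """Owner computes: iterations i for which p owns the left-hand side A(i + lhs_offset).

    The owned index range is mapped back through the inverse of the
    subscript function f(i) = i + lhs_offset, then clipped to the loop bounds.
    """
    lo, hi = block_bounds(n, nprocs, p)
    return max(loop_lo, lo - lhs_offset), min(loop_hi, hi - lhs_offset)

def nonlocal_reads(n, nprocs, p, iters, rhs_offset):
    """Indices of B(i + rhs_offset) read by p's iterations but owned elsewhere."""
    lo, hi = block_bounds(n, nprocs, p)
    it_lo, it_hi = iters
    return [i + rhs_offset
            for i in range(it_lo, it_hi + 1)
            if not lo <= i + rhs_offset <= hi]

# Example loop: do i = 1, 15;  A(i) = B(i+1), with n = 16 and 4 processors.
iters_p1 = local_iterations(16, 4, 1, 1, 15, 0)
reads_p1 = nonlocal_reads(16, 4, 1, iters_p1, 1)
```

In this example, processor 1 owns elements 5 through 8, so it runs iterations 5 through 8 and needs only B(9) from its neighbor. Because the bounds and the nonlocal set come from closed-form expressions, a compiler could in principle derive them at compile time, which is the kind of reasoning the abstract refers to.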