Typed Fusion with Applications to Parallel and Sequential Code Generation
Loop fusion is a program transformation that merges multiple loops into one. It is an effective optimization both for increasing the granularity of parallel loops and for improving data locality. This paper introduces typed fusion, a formulation of loop fusion that captures the fusion and distribution problems encountered in sequential and parallel program optimization. Typed fusion is more general and more widely applicable than previous work. We present a fast algorithm for typed fusion on a graph G = (N, E), where nodes represent loops, edges represent dependence constraints between loops, and each loop is assigned one of T distinct types. Only nodes of the same type may be fused. The asymptotic time bound for this algorithm is O((N + E)T). The fastest previous algorithm considered only one or two types, but was still O(NE) [KM93]. When T > 2 and there is no reason to prefer fusing one type over another, we prove that the problem of finding a fusion with the fewest resultant loops is NP-hard. Using typed fusion, we present fusion and distribution algorithms that improve data locality and a parallel code generation algorithm that incorporates compound transformations. We also give evidence of the effectiveness of this algorithm in practice.
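To make the problem statement concrete, the following Python sketch models only the two constraints named in the abstract: a fused group may contain loops of a single type, and fusing must not contradict the dependence edges (checked here by requiring that the graph stay acyclic once each group is contracted to one node). It is not the O((N + E)T) algorithm of the paper. It is a naive greedy placement that re-checks legality at every step, and it ignores any further constraints the full paper may impose. The function names `is_legal_fusion` and `naive_typed_fusion`, the type labels, and the example graphs are all hypothetical.

```python
from collections import defaultdict, deque


def is_legal_fusion(types, edges, group_of):
    """Check a (possibly partial) grouping of loops against the two constraints."""
    # Constraint 1: every group holds loops of a single type.
    group_type = {}
    for node, g in group_of.items():
        if group_type.setdefault(g, types[node]) != types[node]:
            return False
    # Constraint 2: contracting each group to one node leaves the graph acyclic.
    # Edges touching loops that are not yet placed are ignored.
    succ = defaultdict(set)
    for u, v in edges:
        if u in group_of and v in group_of and group_of[u] != group_of[v]:
            succ[group_of[u]].add(group_of[v])
    indeg = {g: 0 for g in group_type}
    for targets in succ.values():
        for h in targets:
            indeg[h] += 1
    # Kahn's algorithm: the contracted graph is acyclic iff every group is visited.
    ready = deque(g for g, d in indeg.items() if d == 0)
    visited = 0
    while ready:
        g = ready.popleft()
        visited += 1
        for h in succ.get(g, ()):
            indeg[h] -= 1
            if indeg[h] == 0:
                ready.append(h)
    return visited == len(indeg)


def naive_typed_fusion(types, edges, topo_order):
    """Greedily place each loop in the first same-type group that stays legal.

    Assumes topo_order lists the loops in a topological order of the edges.
    """
    group_of = {}
    group_types = []  # group_types[g] is the type shared by group g
    for node in topo_order:
        for g, t in enumerate(group_types):
            if t != types[node]:
                continue
            group_of[node] = g  # tentatively fuse, then verify
            if is_legal_fusion(types, edges, group_of):
                break
            del group_of[node]
        else:
            # No existing group works, so the loop starts a new one.
            group_types.append(types[node])
            group_of[node] = len(group_types) - 1
    return group_of


# Illustrative input: three loops, two types.
types = {"A": "parallel", "B": "sequential", "C": "parallel"}
chain = naive_typed_fusion(types, [("A", "B"), ("B", "C")], ["A", "B", "C"])
fork = naive_typed_fusion(types, [("A", "B"), ("A", "C")], ["A", "B", "C"])
```

In the chain A → B → C, loops A and C share a type but cannot be fused, because B must run between them, so `chain` holds three groups. In the fork A → B, A → C, nothing forces C after B, so `fork` places A and C in the same group.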
Kennedy, Ken and McKinley, Kathryn S. "Typed Fusion with Applications to Parallel and Sequential Code Generation." (1994) https://hdl.handle.net/1911/96439.