Resource Constrained Loop Fusion

Ding, Chen; Kennedy, Ken

Files

TR03-424.pdf (77.77 KB)

Date

2003-09-03

Authors

Ding, Chen

Kennedy, Ken

Abstract

Embedded processors have limited on-chip memory. Fusing loops that use the same data can reduce the distance between accesses to the same memory location, avoiding costly off-chip memory transfers. Most existing greedy fusion algorithms solve the unconstrained problem: they do not guard against the negative effects of excessive fusion. When a large program contains a great number of loops, unconstrained fusion may generate huge loops that overflow on-chip memory, leading to lower performance. This paper studies the problem of constrained weighted fusion, in which the graph edges carry weights indicating the profitability of fusing the inputs and the vertices are annotated with resource requirements. The optimal solution of a constrained weighted fusion problem is a collection of vertex sets (clusters) such that the total weight associated with pairs of vertices within clusters is maximized and the aggregate resource requirement of every cluster is less than a fixed upper bound R. Finding the optimal solution to a weighted fusion problem (constrained or unconstrained) is NP-complete, so we use heuristics. We present two methods. The first picks a group of loops at each fusion step. To ease the resource calculation and fusibility test, the second method picks only a pair of candidate loops at each step. The paper presents the two algorithms, their complexity, and an experimental evaluation.
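
To make the problem statement concrete, the following is a minimal Python sketch of the objective, the resource constraint, and a simple pairwise greedy heuristic. It is an illustration only and is not the algorithm from the report. It assumes that resource requirements are additive within a cluster and that any two loops may legally be fused (no fusibility or ordering test). The function names, the loop names, and the example numbers are hypothetical.

```python
from itertools import combinations

def total_weight(clusters, weights):
    """Objective: sum of edge weights over vertex pairs inside the same cluster."""
    return sum(
        weights.get(frozenset(pair), 0)
        for cluster in clusters
        for pair in combinations(sorted(cluster), 2)
    )

def is_feasible(clusters, resource, bound):
    """Constraint: every cluster's aggregate requirement is strictly below the bound.
    Assumption: requirements simply add up when loops are fused."""
    return all(sum(resource[v] for v in c) < bound for c in clusters)

def greedy_pairwise_fusion(resource, weights, bound):
    """Repeatedly merge the pair of clusters with the largest positive weight gain
    whose merged requirement stays below the bound. Assumption: no fusibility test."""
    clusters = [frozenset([v]) for v in sorted(resource)]
    while True:
        best, best_gain = None, 0
        for a, b in combinations(clusters, 2):
            if sum(resource[v] for v in a | b) >= bound:
                continue  # merging would reach or exceed the resource bound
            # Gain is the weight of the edges that cross between the two clusters.
            gain = sum(weights.get(frozenset((u, v)), 0) for u in a for v in b)
            if gain > best_gain:
                best, best_gain = (a, b), gain
        if best is None:
            return clusters  # no profitable, feasible merge remains
        a, b = best
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]

# Hypothetical example: four loops, their resource needs, and fusion profits.
resource = {"L1": 2, "L2": 2, "L3": 3, "L4": 1}
weights = {
    frozenset(("L1", "L2")): 5,
    frozenset(("L2", "L3")): 4,
    frozenset(("L3", "L4")): 3,
    frozenset(("L1", "L4")): 1,
}
clusters = greedy_pairwise_fusion(resource, weights, bound=6)
print(clusters, total_weight(clusters, weights))
```

In this example the sketch fuses L1 with L2 and L3 with L4. It never forms a cluster containing both L2 and L3, because the bound rules out any further merge once those pairs exist. A greedy choice of this kind is not guaranteed to be optimal in general.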

Type

Technical report

Citation

Ding, Chen and Kennedy, Ken. "Resource Constrained Loop Fusion." (2003) https://hdl.handle.net/1911/96319.

Rights

You are granted permission for the noncommercial reproduction, distribution, display, and performance of this technical report in any format, but this permission is only for a period of forty-five (45) days from the most recent time that you verified that this technical report is still available from the Computer Science Department of Rice University under terms that include this permission. All other rights are reserved by the author(s).

Citable link to this page

https://hdl.handle.net/1911/96319

Collections

Computer Science Technical Reports
