Mapping a Dataflow Programming Model onto Heterogeneous Architectures

Sbirlea, Alina

dc.contributor.advisor	Sarkar, Vivek	en_US
dc.contributor.committeeMember	Cooper, Keith D.	en_US
dc.contributor.committeeMember	Mellor-Crummey, John	en_US
dc.contributor.committeeMember	Budimlic, Zoran	en_US
dc.creator	Sbirlea, Alina	en_US
dc.date.accessioned	2012-09-06T00:05:44Z	en_US
dc.date.accessioned	2012-09-06T00:05:47Z	en_US
dc.date.available	2012-09-06T00:05:44Z	en_US
dc.date.available	2012-09-06T00:05:47Z	en_US
dc.date.created	2012-05	en_US
dc.date.issued	2012-09-05	en_US
dc.date.submitted	May 2012	en_US
dc.date.updated	2012-09-06T00:05:48Z	en_US
dc.description.abstract	This thesis describes and evaluates how extending Intel's Concurrent Collections (CnC) programming model can address the problem of hybrid programming with high performance and low energy consumption, while retaining the ease of use of data-flow programming. The CnC model is a declarative, dynamic, lightweight, task-based parallel programming model that is implicitly deterministic because it enforces the single-assignment rule; these properties ensure that problems are modelled in an intuitive way. CnC offers a separation of concerns by allowing algorithms to be expressed as a two-stage process: first by decomposing a problem into components and specifying how the components interact with each other, and second by providing an implementation for each component. By facilitating the separation between a domain expert, who can provide an accurate problem specification at a high level, and a tuning expert, who can tune the individual components for better performance, we ensure that tuning and future development, such as replacement of a subcomponent with a more efficient algorithm, become straightforward. A recent trend in mainstream desktop systems is the use of graphics processing units (GPUs) to obtain order-of-magnitude performance improvements relative to general-purpose CPUs. In addition, the use of FPGAs has seen a significant increase for applications that can take advantage of such dedicated hardware. Computing is evolving from using many-core CPUs to "co-processing" on the CPU, GPU and FPGA; however, hybrid programming models that support the interaction between multiple heterogeneous components are not widely accessible to mainstream programmers and domain experts who have a real need for such resources. We propose a C-based implementation of the CnC model for enabling parallelism across heterogeneous processor components in a flexible way, with high resource utilization and high programmability. 
We use the task-parallel HabaneroC (HC) language as the platform for implementing CnC-HabaneroC (CnC-HC). HC is also the language used to implement the computation steps in CnC-HC and to interact with GPU or FPGA steps, and it offers the desired flexibility and extensibility of interacting with any other C-based language. First, we extend the CnC model with tag functions and ranges to enable automatic code generation of high-level operations for inter-task communication. This improves programmability and also makes the code more analysable, opening the door for future optimizations. Secondly, we introduce a way to specify steps that are data parallel and thus are fit to execute on the GPU, and the notion of task affinity, a tuning annotation in the specification language. Affinity is used by the runtime during scheduling and can be fine-tuned based on application needs to achieve better (faster, lower power, etc.) results. Thirdly, we introduce and develop a novel, data-driven runtime for the CnC model, using HC as the base language. In addition, we create an implementation of the previous runtime approach and conduct a study to compare the performance of the two. Next, we expand the HabaneroC dynamic work-stealing runtime to allow cross-device stealing based on task affinity. Cross-device dynamic work-stealing is used to achieve load balancing across heterogeneous platforms for improved performance. Finally, we implement and use a series of benchmarks for testing the model in different scenarios and show that our proposed approach can yield significant performance benefits and low power usage when using hybrid execution.	en_US
dc.format.mimetype	application/pdf	en_US
dc.identifier.citation	Sbirlea, Alina. "Mapping a Dataflow Programming Model onto Heterogeneous Architectures." (2012) Master’s Thesis, Rice University. https://hdl.handle.net/1911/64634.	en_US
dc.identifier.slug	123456789/ETD-2012-05-83	en_US
dc.identifier.uri	https://hdl.handle.net/1911/64634	en_US
dc.language.iso	eng	en_US
dc.rights	Copyright is held by the author, unless otherwise indicated. Permission to reuse, publish, or reproduce the work beyond the bounds of fair use or other exemptions to copyright law must be obtained from the copyright holder.	en_US
dc.subject	Data flow model	en_US
dc.subject	Heterogeneous architectures	en_US
dc.subject	Domain specific language	en_US
dc.subject	Tuning annotations	en_US
dc.title	Mapping a Dataflow Programming Model onto Heterogeneous Architectures	en_US
dc.type	Thesis	en_US
dc.type.material	Text	en_US
thesis.degree.department	Computer Science	en_US
thesis.degree.discipline	Engineering	en_US
thesis.degree.grantor	Rice University	en_US
thesis.degree.level	Masters	en_US
thesis.degree.name	Master of Science	en_US

Files

Original bundle

Name: SBIRLEA-THESIS.pdf
Size: 1.89 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.61 KB
Format: Item-specific license agreed to upon submission

Collections

Rice University Theses and Dissertations