Browsing by Author "Dotsenko, Yuri"
Item: Expressiveness, Programmability and Portable High Performance of Global Address Space Languages (2007-01-30)
Dotsenko, Yuri
The Message Passing Interface (MPI) is the library-based programming model employed by most scalable parallel applications today; however, it is not easy to use. To simplify program development, Partitioned Global Address Space (PGAS) languages have emerged as promising alternatives to MPI. Co-array Fortran (CAF), Titanium, and Unified Parallel C are explicitly parallel single-program multiple-data languages that provide the abstraction of a global shared memory and enable programmers to use one-sided communication to access remote data. This thesis focuses on evaluating PGAS languages and explores new language features to simplify the development of high performance programs in CAF. To simplify program development, we explore extending CAF with abstractions for group, Cartesian, and graph communication topologies that we call co-spaces. The combination of co-spaces, textual barriers, and single values enables effective analysis and optimization of CAF programs. We present an algorithm for synchronization strength reduction (SSR), which replaces textual barriers with faster point-to-point synchronization. This optimization is both difficult and error-prone for developers to perform manually. SSR-optimized versions of Jacobi iteration and the NAS MG and CG benchmarks yield performance similar to that of our best hand-optimized variants and demonstrate significant improvement over their barrier-based counterparts. To simplify the development of codes that rely on producer-consumer communication, we explore extending CAF with multi-version variables (MVVs). MVVs increase programmer productivity by insulating application developers from the details of buffer management, communication, and synchronization. Sweep3D, NAS BT, and NAS SP codes expressed using MVVs are much simpler than the fastest hand-coded variants, and experiments show that they yield similar performance. To avoid exposing latency in distributed memory systems, we explore extending CAF with distributed multithreading (DMT) based on the concept of function shipping. Function shipping facilitates co-locating computation with data as well as executing several asynchronous activities in the remote and local memory. DMT uses co-subroutines/cofunctions to ship computation with either blocking or non-blocking semantics. A prototype implementation and experiments show that DMT simplifies development of parallel search algorithms and that the performance of DMT-based RandomAccess exceeds that of the reference MPI implementation.

Item: Expressiveness, programmability and portable high performance of global address space languages (2007)
Dotsenko, Yuri; Mellor-Crummey, John
The Message Passing Interface (MPI) is the library-based programming model employed by most scalable parallel applications today; however, it is not easy to use. To simplify program development, Partitioned Global Address Space (PGAS) languages have emerged as promising alternatives to MPI. Co-array Fortran (CAF), Titanium, and Unified Parallel C are explicitly parallel single-program multiple-data languages that provide the abstraction of a global shared memory and enable programmers to use one-sided communication to access remote data. This thesis focuses on evaluating PGAS languages and explores new language features to simplify the development of high performance programs in CAF. To simplify program development, we explore extending CAF with abstractions for group, Cartesian, and graph communication topologies that we call co-spaces. The combination of co-spaces, textual barriers, and single values enables effective analysis and optimization of CAF programs. We present an algorithm for synchronization strength reduction (SSR), which replaces textual barriers with faster point-to-point synchronization. This optimization is both difficult and error-prone for developers to perform manually. SSR-optimized versions of Jacobi iteration and the NAS MG and CG benchmarks yield performance similar to that of our best hand-optimized variants and demonstrate significant improvement over their barrier-based counterparts. To simplify the development of codes that rely on producer-consumer communication, we explore extending CAF with multi-version variables (MVVs). MVVs increase programmer productivity by insulating application developers from the details of buffer management, communication, and synchronization. Sweep3D, NAS BT, and NAS SP codes expressed using MVVs are much simpler than the fastest hand-coded variants, and experiments show that they yield similar performance. To avoid exposing latency in distributed memory systems, we explore extending CAF with distributed multithreading (DMT) based on the concept of function shipping. Function shipping facilitates co-locating computation with data as well as executing several asynchronous activities in the remote and local memory. DMT uses co-subroutines/cofunctions to ship computation with either blocking or non-blocking semantics. A prototype implementation and experiments show that DMT simplifies development of parallel search algorithms and that the performance of DMT-based RandomAccess exceeds that of the reference MPI implementation.

Item: Extensible Adaptation via Constraint Solving (2002-04-30)
Dotsenko, Yuri
This work presents the design, implementation, and evaluation of a simple programming language for expressing scheduling policies for transmission of multiple objects across a shared network connection. A key design component of the language is the ability to express constraints among the objects to be transmitted. Policies can impose ordering constraints, such as "all text objects are transmitted before any image objects"; express rules on the relative bandwidth allocations across objects of different types; and reserve a certain amount of bandwidth or restrict transmission of a subset of objects. Because it is possible to express contradictory constraints, the system finds suitable approximate solutions when no precise solution is available.

Item: Extensible adaptation via constraint solving (2002)
Dotsenko, Yuri; Wallach, Dan S.
This work presents the design, implementation, and evaluation of a simple programming language for expressing scheduling policies for transmission of multiple objects across a shared network connection. A key design component of the language is the ability to express constraints among the objects to be transmitted. Policies can impose ordering constraints, such as "all text objects are transmitted before any image objects"; express rules on the relative bandwidth allocations across objects of different types; and reserve a certain amount of bandwidth or restrict transmission of a subset of objects. Because it is possible to express contradictory constraints, the system finds suitable approximate solutions when no precise solution is available.