
Browsing by Author "Lu, Honghui"

Now showing 1 - 3 of 3
  • Message passing versus distributed shared memory on networks of workstations
    (1995) Lu, Honghui; Zwaenepoel, Willy
    We compared the message-passing library Parallel Virtual Machine (PVM) with the distributed shared memory system TreadMarks on networks of workstations. We presented the performance of nine applications: Water and Barnes-Hut from the SPLASH benchmarks; 3-D FFT, Integer Sort, and Embarrassingly Parallel from the NAS benchmarks; ILINK, a widely used genetic analysis program; and SOR, TSP, and QuickSort. TreadMarks performed nearly identically to PVM on computation-bound programs, such as the Water simulation of 1728 molecules. For most of the other applications, including ILINK, TreadMarks performed within 75% of PVM with 8 processes. Two reasons for TreadMarks's lower performance were the separation of synchronization and data transfer, and the additional messages needed to request updates for data in the invalidate-based shared-memory protocol. TreadMarks also suffered from extra data communication due to false sharing. Moreover, PVM benefited from the ability to aggregate scattered data in a single message.
  • OpenMP on Networks of Workstations
    (2001-01-30) Lu, Honghui
    The OpenMP Application Programming Interface (API) is an emerging standard for parallel programming on shared memory architectures. Networks of workstations (NOWs) are attractive parallel programming platforms because of their good price/performance ratio, as well as their flexibility and their potential to scale. This work is the first to extend the support for OpenMP to networks of workstations. The design is based on integrating the compiler and the run-time system. In the combined system, the run-time library remains the basic vehicle for implementing shared memory, while the compiler performs optimization rather than implementation. The integrated approach can effectively optimize irregular applications, for which an exact compile-time analysis is not possible. One optimization aggregates messages for irregular applications based on the data access information provided by the compiler. Another optimization improves the scalability of the system by computation replication and multicast. The integrated system also simplifies the run-time implementation. In the implementation of OpenMP on networks of shared-memory multiprocessors, the compile-time information greatly reduces the number of changes required to the run-time system in order to exploit the intra-node hardware shared memory. In another implementation that allows OpenMP programs to change the number of computing nodes during the execution, the OpenMP semantics provides convenient points for efficient adaptation.
  • OpenMP on networks of workstations
    (2001) Lu, Honghui; Zwaenepoel, Willy
    The OpenMP Application Programming Interface (API) is an emerging standard for parallel programming on shared memory architectures. Networks of workstations (NOWs) are attractive parallel programming platforms because of their good price/performance ratio, as well as their flexibility and their potential to scale. This work is the first to extend the support for OpenMP to networks of workstations. The design is based on integrating the compiler and the run-time system. In the combined system, the run-time library remains the basic vehicle for implementing shared memory, while the compiler performs optimization rather than implementation. The integrated approach can effectively optimize irregular applications, for which an exact compile-time analysis is not possible. One optimization aggregates messages for irregular applications based on the data access information provided by the compiler. Another optimization improves the scalability of the system by computation replication and multicast. The integrated system also simplifies the run-time implementation. In the implementation of OpenMP on networks of shared-memory multiprocessors, the compile-time information greatly reduces the number of changes required to the run-time system in order to exploit the intra-node hardware shared memory. In another implementation that allows OpenMP programs to change the number of computing nodes during the execution, the OpenMP semantics provides convenient points for efficient adaptation.