Browsing by Author "Abdel-Shafi, Hazim M."

Now showing 1 - 2 of 2
  • Fine-grain producer-initiated communication in cache-coherent multiprocessors
    (1997) Abdel-Shafi, Hazim M.; Adve, Sarita V.
    Shared-memory multiprocessors are becoming increasingly popular as a high-performance, easy-to-program, and relatively inexpensive choice for parallel computation. However, the performance of shared-memory multiprocessors is limited by memory latency. Memory latencies are higher in multiprocessors due to physical constraints and cache coherence overheads. In addition, synchronization operations, which are necessary to ensure correctness in parallel programs, add further communication overhead in shared-memory multiprocessors. Software-controlled non-binding data prefetching is a widely used consumer-initiated mechanism to hide communication latency and is currently supported on most architectures. However, on an invalidation-based cache-coherent multiprocessor, prefetching is inapplicable or insufficient for some communication patterns such as irregular communication, fine-grain pipelined loops, and synchronization. For these cases, a combination of two fine-grain, producer-initiated primitives (referred to as remote writes) is better able to reduce the latency of communication. This work demonstrates experimentally that remote writes provide significant performance benefits in cache-coherent shared-memory multiprocessors both with and without prefetching. Further, the combination of remote writes and prefetching is able to eliminate most of the memory system overheads in our applications, except for misses due to cache conflicts.
  • Reliable parallel computing on clusters of multiprocessors
    (2000) Abdel-Shafi, Hazim M.; Bennett, John K.
    This dissertation describes the design, implementation, and performance of two mechanisms that address reliability and system management problems associated with parallel computing clusters: thread migration and checkpoint/recovery. A unique aspect of this work is the integration of these two mechanisms. Although there has been considerable prior work on each of these mechanisms in isolation, their integration offers synergistic benefit to both functionality and performance. Used in conjunction, these mechanisms facilitate failure recovery, as well as node addition and removal, with minimal disruption of executing applications. Our implementation differs from previous work in the following ways. First, by using thread migration instead of process migration, the overhead of moving computation among nodes is reduced. Second, because our implementation of checkpoint/recovery separates computation and data, it is possible to distribute data and threads among other nodes during recovery. This is possible because the underlying support for thread migration in the system allows the recovery of a thread from any checkpoint on any node. Third, our implementation does not require repartitioning of a running parallel application when resources are added or removed. Finally, the checkpoint/recovery and thread migration mechanisms are both implemented at user level. The benefits of a user-level implementation include ease of development since operating system source code is not required, adaptability to other platforms, and simple upgrades to new versions of the underlying operating system and hardware. The prototype implementation described in this thesis was developed as an extension to the Brazos software distributed shared memory system. Brazos allows multithreaded parallel applications to execute on networks of multiprocessor servers running the Windows NT/2000 operating system.