Mellor-Crummey, John
2017-08-02
2016-05
2016-02-12
May 2016
Paul, Sri Raj. "Performance Analysis and Optimization of a Hybrid Distributed Reverse Time Migration Application." (2016) Master’s Thesis, Rice University. https://hdl.handle.net/1911/96190.

To fully exploit emerging processor architectures, programs will need to employ threaded parallelism within a node and message passing across nodes. Today, MPI+OpenMP is the preferred programming model for this task. However, tuning MPI+OpenMP programs for clusters is difficult. Performance tools can help users identify bottlenecks and uncover opportunities for improvement. Applications that analyze seismic data employ scalable parallel systems to produce timely results. This thesis describes our experience applying performance tools to gain insight into an MPI+OpenMP code that performs Reverse Time Migration (RTM) to analyze seismic data. It also assesses the capabilities of available tools for analyzing the performance of a sophisticated application that employs both message-passing and threaded parallelism. The tools provided us with insights into the effectiveness of the domain decomposition strategy, the use of threaded parallelism, and functional unit utilization in individual cores. By applying insights obtained from Rice University's HPCToolkit and hardware performance counters, we were able to improve the performance of the RTM code by roughly 30 percent.

application/pdf
eng
Copyright is held by the author, unless otherwise indicated. Permission to reuse, publish, or reproduce the work beyond the bounds of fair use or other exemptions to copyright law must be obtained from the copyright holder.
Performance Analysis tools; Reverse time migration; Hybrid programming model; MPI+OpenMP
Performance Analysis and Optimization of a Hybrid Distributed Reverse Time Migration Application
Thesis