Browsing by Author "Liu, Xu"
Performance Analysis of Program Executions on Modern Parallel Architectures
(2014-07-25) Liu, Xu; Mellor-Crummey, John; Sarkar, Vivek; Varman, Peter; Browne, James
Parallel architectures have become common in supercomputers, data centers, and mobile chips. These architectures usually have complex features: many hardware threads, deep memory hierarchies, and non-uniform memory access (NUMA). Programs designed without careful consideration of these features may perform poorly on such architectures. First, multi-threaded programs can suffer from performance degradation caused by imbalanced workload, overuse of synchronization, and parallel overhead. Second, parallel programs may suffer from the long latency of accesses to main memory. Third, in a NUMA system, memory accesses can be remote rather than local. Without a NUMA-aware design, a threaded program may incur many costly remote accesses and imbalanced memory requests to NUMA domains. Performance tools can help us take full advantage of the power of parallel architectures by providing insight into where and why a program fails to obtain top performance. This dissertation addresses the difficulty of obtaining insights about performance bottlenecks in parallel programs using lightweight measurement techniques. This dissertation makes four contributions. First, it describes a novel performance analysis method for OpenMP programs, which can identify root causes of performance losses. Second, it presents a data-centric analysis method that associates performance metrics with data objects. This data-centric analysis can identify both a program's problematic memory accesses and the associated variables; this information can help an application developer optimize programs for better locality. Third, this dissertation discusses the development of a lightweight method that collects memory reuse distance to guide cache locality optimization. Finally, it describes a lightweight profiling method that can help pinpoint performance losses in programs on NUMA architectures and provide guidance about how to transform the program to improve performance. To validate the utility of these methods, I implemented them in HPCToolkit, a state-of-the-art profiler developed at Rice University. I used the extended HPCToolkit to study several parallel programs. Guided by the performance insights provided by the new techniques introduced in this dissertation, I optimized all of these programs and was able to obtain non-trivial improvements to their performance. The measurement overhead incurred by these new analysis methods is very small in both runtime and memory.

UBER v1.0: a universal kinetic equation solver for radiation belts
(European Geosciences Union, 2021) Zheng, Liheng; Chen, Lunjin; Chan, Anthony A.; Wang, Peng; Xia, Zhiyang; Liu, Xu
Recent proceedings in radiation belt studies have proposed new requirements for numerical methods to solve the kinetic equations involved. In this article, we present a numerical solver that can solve the general form of the radiation belt Fokker–Planck equation and Boltzmann equation in arbitrarily provided coordinate systems and with user-specified boundary geometry, boundary conditions, and equation terms. The solver is based upon the mathematical theory of stochastic differential equations, whose computational accuracy and efficiency are greatly enhanced by specially designed adaptive algorithms and a variance reduction technique. The versatility and robustness of the solver are exhibited in four example problems. The solver applies to a wide spectrum of radiation belt modeling problems, including the ones featuring non-diffusive particle transport such as that arising from nonlinear wave–particle interactions.