Adhianto, L.; Banerjee, S.; Fagan, M.; Krentel, M.; Marin, G.; Mellor-Crummey, J.; Tallent, N. R.
2017-08-02; 2017-08-02; 2008-10-03
Adhianto, L., Banerjee, S., Fagan, M., et al. "HPCTOOLKIT: Tools for performance analysis of optimized parallel programs." (2008)

HPCTOOLKIT is an integrated suite of tools that supports measurement, analysis, attribution, and presentation of application performance for both sequential and parallel programs. HPCTOOLKIT can pinpoint and quantify scalability bottlenecks in fully-optimized parallel programs with a measurement overhead of only a few percent. Recently, new capabilities were added to HPCTOOLKIT for collecting call path profiles for fully-optimized codes without any compiler support, pinpointing and quantifying bottlenecks in multithreaded programs, exploring performance information and source code using a new user interface, and displaying hierarchical space-time diagrams based on traces of asynchronous call stack samples. This paper provides an overview of HPCTOOLKIT and illustrates its utility for performance analysis of parallel applications.

16 pp
eng
You are granted permission for the noncommercial reproduction, distribution, display, and performance of this technical report in any format, but this permission is only for a period of forty-five (45) days from the most recent time that you verified that this technical report is still available from the Computer Science Department of Rice University under terms that include this permission. All other rights are reserved by the author(s).
HPCTOOLKIT: Tools for performance analysis of optimized parallel programs
Technical report
TR08-06