Exploring the Potential for Accelerating Sparse Matrix-Vector Product on a Processing-in-Memory Architecture


As the impact of memory access delays on performance has grown over the last few decades, researchers have begun exploring Processing-in-Memory (PIM) technology, which offers higher memory bandwidth, lower memory latency, and lower power consumption. In this study, we investigate whether an emerging PIM design from Sandia National Laboratories can boost performance for sparse matrix-vector product (SMVP). While SMVP is bandwidth-bound in the best case, factors related to matrix structure and representation also limit performance. We analyze SMVP in the context of both an AMD Opteron processor and the Sandia PIM, exploring the performance limiters for each and the degree to which these can be ameliorated by data and code transformations. Over a range of sparse matrices, SMVP on the PIM outperformed the Opteron by a factor of 1.82. On the PIM, computational kernel and data structure transformations improved performance by almost 40% over conventional implementations using compressed sparse row (CSR) format.
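For readers unfamiliar with the compressed sparse row layout mentioned in the abstract, the following is a minimal sketch of a conventional CSR matrix-vector product. It is not the kernel used in the report; the function name `csr_matvec`, the array names `values`, `col_idx`, and `row_ptr`, and the small example matrix are all assumptions chosen for illustration.

```python
def csr_matvec(values, col_idx, row_ptr, x):
    """Return y = A @ x for a matrix A stored in CSR form.

    values  : nonzero entries of A, stored row by row
    col_idx : column index of each entry in `values`
    row_ptr : row i occupies values[row_ptr[i]:row_ptr[i + 1]]
    (All names are illustrative, not taken from the report.)
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            # Indirect access: x is read through col_idx, so the
            # access pattern depends on the matrix structure.
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y


# Hypothetical 3x3 example: A = [[4, 0, 1], [0, 2, 0], [3, 0, 5]]
values = [4.0, 1.0, 2.0, 3.0, 5.0]
col_idx = [0, 2, 1, 0, 2]
row_ptr = [0, 2, 3, 5]
x = [1.0, 2.0, 3.0]
y = csr_matvec(values, col_idx, row_ptr, x)
```

Each nonzero value is read once and used in a single multiply-add, which is one reason the operation tends to be limited by memory bandwidth rather than arithmetic, and the indirect reads of `x` show how matrix structure can further affect performance.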

This work was also published as a Rice University thesis/dissertation: http://hdl.handle.net/1911/61946
Technical report

Youssefi, Annahita. "Exploring the Potential for Accelerating Sparse Matrix-Vector Product on a Processing-in-Memory Architecture." (2008) https://hdl.handle.net/1911/96374.

You are granted permission for the noncommercial reproduction, distribution, display, and performance of this technical report in any format, but this permission is only for a period of forty-five (45) days from the most recent time that you verified that this technical report is still available from the Computer Science Department of Rice University under terms that include this permission. All other rights are reserved by the author(s).