Conference: IEEE International Symposium on Computer Architecture and High Performance Computing

STOMP: Statistical Techniques for Optimizing and Modeling Performance of Blocked Sparse Matrix Vector Multiplication

Abstract

Sparse-matrix vector multiplication (SpMV) is the core compute routine for several scientific and commercial codebases. Because of its extremely irregular memory accesses (low temporal locality), indirect memory referencing (low spatial locality), low arithmetic intensity, and the non-zero pattern and non-zero density of the matrix, SpMV achieves a mere 10% of peak system performance. Because sparse matrices have extremely varied non-zero patterns and densities, performance of SpMV is hard to predict. Blocking sparse matrices increases arithmetic intensity and spatial locality during SpMV operations, thereby improving SpMV performance. However, selection of an incorrect block size can produce performance degradation as high as 70%. In this study, we describe the STOMP approach of using statistical techniques to predict run time of SpMV in PETSc for new matrices with mean accuracy of 93.52%. We use these statistical prediction models to guide block size selection to achieve up to 100% of optimal performance, comparable to that attained through exhaustive block size search. Our block size selection results produce an average of 55.56% speedup over default SpMV options. On the same set of matrices used in the SPARSITY SpMV framework, STOMP yields a 54.46% speedup while SPARSITY yields a 31.62% speedup over the same default.
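
To make the block size trade-off concrete, the following is a minimal sketch of blocked SpMV in plain Python. It is not the STOMP method or the PETSc implementation: the function names (`to_blocked`, `fill_ratio`, `blocked_spmv`) and the dictionary-of-blocks layout are hypothetical, chosen only for illustration. The fill ratio (stored values divided by true non-zeros) is used here as an assumed proxy for the cost of a poorly chosen block size, and the sketch assumes each (row, column) coordinate appears at most once in the input.

```python
from collections import defaultdict


def to_blocked(entries, r, c):
    """Group coordinate entries (i, j, value) into dense r x c blocks.

    Hypothetical layout: a dict mapping (block_row, block_col) to an
    r x c list of lists. Positions with no entry hold explicit zeros.
    """
    blocks = defaultdict(lambda: [[0.0] * c for _ in range(r)])
    for i, j, v in entries:
        blocks[(i // r, j // c)][i % r][j % c] += v
    return dict(blocks)


def fill_ratio(entries, blocks, r, c):
    """Stored values (including explicit zeros) per true non-zero.

    Assumes `entries` has no duplicate coordinates. A ratio of 1.0 means
    the blocking stores no padding; larger values mean wasted work.
    """
    return len(blocks) * r * c / len(entries)


def blocked_spmv(blocks, x, n_rows, r, c):
    """Compute y = A @ x from the blocked representation."""
    y = [0.0] * n_rows
    for (bi, bj), block in blocks.items():
        for di in range(r):
            row = bi * r + di
            if row >= n_rows:
                continue  # block hangs over the bottom edge of the matrix
            acc = 0.0
            for dj in range(c):
                col = bj * c + dj
                if col < len(x):  # skip padding past the right edge
                    acc += block[di][dj] * x[col]
            y[row] += acc
    return y


# A 4 x 4 matrix made of two dense 2 x 2 diagonal blocks.
entries = [
    (0, 0, 1.0), (0, 1, 2.0), (1, 0, 3.0), (1, 1, 4.0),
    (2, 2, 5.0), (2, 3, 6.0), (3, 2, 7.0), (3, 3, 8.0),
]
x = [1.0, 1.0, 1.0, 1.0]

for r, c in [(2, 2), (3, 3)]:
    blocks = to_blocked(entries, r, c)
    print((r, c), fill_ratio(entries, blocks, r, c), blocked_spmv(blocks, x, 4, r, c))
```

In this example both block sizes give the same product, but the 2 x 2 blocking matches the non-zero pattern and stores no padding (fill ratio 1.0), while the 3 x 3 blocking stores 36 values for 8 non-zeros (fill ratio 4.5). This is one simple way to see why a mismatched block size can cost performance; the abstract's approach instead predicts run time with statistical models.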