Journal: IEEE Transactions on Parallel and Distributed Systems

An Extended Compression Format for the Optimization of Sparse Matrix-Vector Multiplication

Abstract

Sparse matrix-vector multiplication ($\mathrm{SpM}\times\mathrm{V}$) has been characterized as one of the most significant computational scientific kernels. The key algorithmic characteristic of the $\mathrm{SpM}\times\mathrm{V}$ kernel that inhibits it from achieving high performance is its very low flop:byte ratio. In this paper, we present a compressed storage format, called Compressed Sparse eXtended (CSX), that is able to detect and encode simultaneously multiple commonly encountered substructures inside a sparse matrix. Relying on aggressive compression techniques of the sparse matrix's indexing structure, CSX is able to considerably reduce the memory footprint of a sparse matrix, alleviating the pressure on the memory subsystem. In a diverse set of sparse matrices, CSX was able to provide a more than 40 percent average performance improvement over the standard CSR format in SMP architectures and surpassed 20 percent improvement in NUMA systems, significantly outperforming other CSR alternatives. Additionally, it was able to adapt successfully to the nonzero element structure of the considered matrices, exhibiting very stable performance. Finally, in the context of a "real-life" multiphysics simulation software, CSX accelerated the $\mathrm{SpM}\times\mathrm{V}$ component by nearly 40 percent and the total solver time by approximately 15 percent.
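To make the baseline concrete, the following is a minimal sketch of the $\mathrm{SpM}\times\mathrm{V}$ kernel over the standard CSR format that the abstract compares against, together with a rough estimate of its flop:byte ratio. All function names here are hypothetical, and the byte counts assume 8-byte values and 4-byte indices and ignore traffic for the input and output vectors. The last helper shows per-row delta encoding of column indices, a generic index-compression idea included only to illustrate why compressing the indexing structure shrinks the footprint; it is not the CSX encoding described in the paper.

```python
def csr_spmv(values, col_idx, row_ptr, x):
    """Compute y = A * x for a matrix A stored in CSR form.

    values  : nonzero values, row by row
    col_idx : column index of each nonzero
    row_ptr : row_ptr[i]..row_ptr[i+1] delimits the nonzeros of row i
    """
    nrows = len(row_ptr) - 1
    y = [0.0] * nrows
    for i in range(nrows):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            # One multiply and one add per nonzero, but each nonzero
            # also costs a value load and an index load from memory.
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y


def csr_flop_per_byte(nnz, nrows, value_bytes=8, index_bytes=4):
    """Rough flop:byte ratio, counting only matrix data (assumed sizes)."""
    flops = 2 * nnz
    matrix_bytes = nnz * (value_bytes + index_bytes) + (nrows + 1) * index_bytes
    return flops / matrix_bytes


def delta_encode_row(cols):
    """Replace a row's sorted column indices by the first index plus gaps.

    Small gaps fit in fewer bytes than full indices; this is a generic
    illustration, not the encoding used by CSX.
    """
    if not cols:
        return []
    return [cols[0]] + [b - a for a, b in zip(cols, cols[1:])]


# Example: the 3x3 matrix [[1, 0, 2], [0, 3, 0], [4, 0, 5]].
values = [1.0, 2.0, 3.0, 4.0, 5.0]
col_idx = [0, 2, 1, 0, 2]
row_ptr = [0, 2, 3, 5]

y = csr_spmv(values, col_idx, row_ptr, [1.0, 1.0, 1.0])  # [3.0, 3.0, 9.0]
ratio = csr_flop_per_byte(nnz=len(values), nrows=3)      # 10 flops / 76 bytes
gaps = delta_encode_row([3, 4, 5, 6, 10])                # [3, 1, 1, 1, 4]
```

Under these assumptions the ratio stays below 2/12 flops per byte no matter how large the matrix is, which is why shrinking the index data, as the abstract describes, directly reduces the memory traffic of the kernel.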
