ALBUS: A method for efficiently processing SpMV using SIMD and Load balancing


Abstract

Sparse matrix-vector multiplication (SpMV) is widely used in many fields, and improving its performance has long been a goal of researchers. Running SpMV in parallel on multi-core processors is the standard approach. In practice, however, the non-zero elements of many sparse matrices are unevenly distributed, so parallelization without preprocessing loses a large amount of performance to load imbalance. In this paper, we propose ALBUS (Absolute Load Balancing Using SIMD (Single Instruction Multiple Data)), a method for efficiently processing SpMV using load balancing and SIMD vectorization. ALBUS balances the load across cores and also makes full use of SIMD vectorization on the CPU. We selected 20 sets of regular matrices and 20 sets of irregular matrices to form the benchmark suite, and compared the SpMV performance of ALBUS, CSR5 (Compressed Sparse Row 5), Merge (merge-based SpMV), and MKL (Math Kernel Library) under the same conditions. On the E5-2670 v3 CPU platform, for the 20 sets of regular matrices, ALBUS achieves average speedups of 1.59×, 1.32×, and 1.48× (up to 2.53×, 2.22×, and 2.31×) over CSR5, Merge, and MKL, respectively. For the 20 sets of irregular matrices, ALBUS achieves average speedups of 1.38×, 1.42×, and 2.44× (up to 2.33×, 2.24×, and 5.37×) over CSR5, Merge, and MKL, respectively.
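
The abstract does not spell out how ALBUS balances the load, so the following is only a minimal sketch of the general idea it describes: splitting the work by non-zero count rather than by row count. It assumes the matrix is stored in CSR form; the function name `balanced_spmv` and its parameters are hypothetical, each loop iteration merely stands in for one thread, and SIMD vectorization is omitted. It is not the authors' implementation.

```python
from bisect import bisect_right

def balanced_spmv(row_ptr, col_idx, vals, x, num_parts):
    """CSR SpMV (y = A @ x) where each part handles a near-equal share of non-zeros.

    Illustrative sketch only: parts run sequentially here, so partial sums for a
    row shared by two parts can simply be added into y.
    """
    n_rows = len(row_ptr) - 1
    nnz = row_ptr[-1]
    y = [0.0] * n_rows
    for p in range(num_parts):  # each iteration stands in for one thread
        start = nnz * p // num_parts        # first non-zero owned by this part
        end = nnz * (p + 1) // num_parts    # one past the last non-zero owned
        if start == end:
            continue  # more parts than non-zeros: nothing to do
        # Binary search for the row that contains non-zero index `start`.
        row = bisect_right(row_ptr, start) - 1
        k = start
        while k < end:
            # Stop at the end of the current row or of this part's range.
            row_end = min(row_ptr[row + 1], end)
            acc = 0.0
            for j in range(k, row_end):
                acc += vals[j] * x[col_idx[j]]
            # In a truly parallel version, rows split across parts would need
            # an atomic add or a separate fix-up pass instead of this update.
            y[row] += acc
            k = row_end
            row += 1
    return y

# 3x3 example with an empty middle row: [[1,0,2],[0,0,0],[3,4,5]]
row_ptr = [0, 2, 2, 5]
col_idx = [0, 2, 0, 1, 2]
vals = [1.0, 2.0, 3.0, 4.0, 5.0]
x = [1.0, 2.0, 3.0]
print(balanced_spmv(row_ptr, col_idx, vals, x, num_parts=3))
```

With a row-based split, the part that owns the dense third row would do most of the work; splitting by non-zero index keeps each part's share within one element of the others, at the cost of handling rows that straddle a boundary.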

Bibliographic details

  • Source
    Future Generation Computer Systems | 2021, Issue 3 | pp. 371-392 | 22 pages
  • Author affiliations

    Department of Computer Technology and Application (HDACP), Qinghai University, Xining, China;

    Department of Computer Technology and Application (HDACP), Qinghai University, Xining, China; Department of Computer Science and Technology (PACMAN), Tsinghua University, Beijing, China;

    Department of Computer Technology and Application (HDACP), Qinghai University, Xining, China;

    Department of Computer Technology and Application (HDACP), Qinghai University, Xining, China;

    Department of Computer Technology and Application (HDACP), Qinghai University, Xining, China;

  • Indexed in: Science Citation Index (SCI, USA); Engineering Index (EI, USA)
  • Original format: PDF
  • Language: English
  • Keywords

    SpMV; ALBUS; CSR5; MKL; SIMD; Load balancing;
