Journal: 《计算机应用》 (Journal of Computer Applications)

Parallel Matrix Multiplication Based on the MPI+CUDA Asynchronous Model

         

Abstract

Matrix multiplication plays an important role in scientific computing, and different structural models can improve the performance of parallel matrix multiplication. In the existing MPI+CUDA synchronous model, the host must enter a waiting state and cannot continue working until the device completes its task, which wastes time. To address this problem, a parallel matrix multiplication method based on an MPI+CUDA asynchronous model is proposed. The model keeps the host from entering the waiting state and uses CUDA streams to handle data volumes that exceed GPU memory. An analysis of the speedup and efficiency of the asynchronous model, together with the experimental results, shows that the method significantly improves parallel efficiency and the speed of large-scale matrix multiplication, taking full advantage of distributed memory across nodes and shared memory within a node. It is an effective and feasible parallel strategy.
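The abstract gives no code, so the following is only a minimal single-node sketch of the idea it describes, not the authors' implementation. It assumes that A is split into row chunks, that B fits in device memory, and that two CUDA streams are enough to show the pattern. The names `matmul_chunk` and `run_streamed_matmul`, the naive kernel, and the test data are all hypothetical, and the MPI distribution across nodes is omitted entirely.

```cuda
#include <cassert>
#include <cstddef>
#include <cuda_runtime.h>

// Sketch only: on any CUDA error we return false without releasing resources.
#define CUDA_TRY(call) do { if ((call) != cudaSuccess) return false; } while (0)

// Naive kernel: C (rows x m) = A (rows x k) * B (k x m), all row-major.
__global__ void matmul_chunk(const float* A, const float* B, float* C,
                             int rows, int k, int m) {
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    if (row < rows && col < m) {
        float sum = 0.0f;
        for (int p = 0; p < k; ++p) sum += A[row * k + p] * B[p * m + col];
        C[row * m + col] = sum;
    }
}

// Multiplies an all-ones A (n x k) by an all-twos B (k x m) in row chunks
// and returns true if every entry of C equals 2 * k.
bool run_streamed_matmul(int n, int k, int m, int chunk_rows) {
    assert(n > 0 && k > 0 && m > 0 && chunk_rows > 0);
    const int num_streams = 2;  // assumption: two streams for illustration

    // Pinned host buffers, required for copies to be truly asynchronous.
    float *hA, *hB, *hC;
    CUDA_TRY(cudaMallocHost(&hA, sizeof(float) * n * k));
    CUDA_TRY(cudaMallocHost(&hB, sizeof(float) * k * m));
    CUDA_TRY(cudaMallocHost(&hC, sizeof(float) * n * m));
    for (size_t i = 0; i < (size_t)n * k; ++i) hA[i] = 1.0f;
    for (size_t i = 0; i < (size_t)k * m; ++i) hB[i] = 2.0f;

    // B is assumed to fit on the device; only A and C are streamed in chunks.
    float* dB;
    CUDA_TRY(cudaMalloc(&dB, sizeof(float) * k * m));
    CUDA_TRY(cudaMemcpy(dB, hB, sizeof(float) * k * m, cudaMemcpyHostToDevice));

    cudaStream_t streams[num_streams];
    float *dA[num_streams], *dC[num_streams];
    for (int s = 0; s < num_streams; ++s) {
        CUDA_TRY(cudaStreamCreate(&streams[s]));
        CUDA_TRY(cudaMalloc(&dA[s], sizeof(float) * chunk_rows * k));
        CUDA_TRY(cudaMalloc(&dC[s], sizeof(float) * chunk_rows * m));
    }

    // Enqueue copy-in, kernel, copy-out per chunk; none of these calls block.
    // Work within one stream is ordered, so reusing dA[s]/dC[s] is safe.
    dim3 block(16, 16);
    for (int row0 = 0, i = 0; row0 < n; row0 += chunk_rows, ++i) {
        int s = i % num_streams;
        int rows = (n - row0 < chunk_rows) ? (n - row0) : chunk_rows;
        dim3 grid((m + 15) / 16, (rows + 15) / 16);
        CUDA_TRY(cudaMemcpyAsync(dA[s], hA + (size_t)row0 * k,
                                 sizeof(float) * rows * k,
                                 cudaMemcpyHostToDevice, streams[s]));
        matmul_chunk<<<grid, block, 0, streams[s]>>>(dA[s], dB, dC[s], rows, k, m);
        CUDA_TRY(cudaGetLastError());
        CUDA_TRY(cudaMemcpyAsync(hC + (size_t)row0 * m, dC[s],
                                 sizeof(float) * rows * m,
                                 cudaMemcpyDeviceToHost, streams[s]));
    }

    // The host is free here. In an MPI+CUDA program it could communicate or
    // compute its own share; this stand-in just counts polling iterations.
    long host_ticks = 0;
    for (int s = 0; s < num_streams; ++s)
        while (cudaStreamQuery(streams[s]) == cudaErrorNotReady) ++host_ticks;
    CUDA_TRY(cudaDeviceSynchronize());

    bool ok = true;
    for (size_t i = 0; i < (size_t)n * m; ++i)
        if (hC[i] != 2.0f * k) ok = false;

    for (int s = 0; s < num_streams; ++s) {
        cudaFree(dA[s]);
        cudaFree(dC[s]);
        cudaStreamDestroy(streams[s]);
    }
    cudaFree(dB);
    cudaFreeHost(hA);
    cudaFreeHost(hB);
    cudaFreeHost(hC);
    return ok;
}
```

A call such as `run_streamed_matmul(1024, 512, 512, 256)` would process A in four chunks alternating between the two streams. Only `chunk_rows` rows of A and C per stream reside on the device at a time, which is how chunking lets the total problem exceed GPU memory. The host stays available between enqueueing the work and the final synchronization.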
