Published in: IEEE Transactions on Parallel and Distributed Systems

Communication Optimization of Iterative Sparse Matrix-Vector Multiply on GPUs and FPGAs

Abstract

Trading communication with redundant computation can increase the silicon efficiency of FPGAs and GPUs in accelerating communication-bound sparse iterative solvers. While iterations of the iterative solver can be unrolled to provide reduction in communication cost, the extent of this unrolling depends on the underlying architecture, its memory model, and the growth in redundant computation. This paper presents a systematic procedure to select this algorithmic parameter k, which provides a communication-computation tradeoff on hardware accelerators like FPGAs and GPUs. We provide predictive models to understand this tradeoff and show how careful selection of k can lead to performance improvement that otherwise demands a significant increase in memory bandwidth. On an Nvidia C2050 GPU, we demonstrate a 1.9×-42.6× speedup over standard iterative solvers for a range of benchmarks and show that this speedup is limited by the growth in redundant computation. In contrast, for FPGAs, we present an architecture-aware algorithm that limits off-chip communication but allows communication between the processing cores. This reduces redundant computation and allows large k and hence higher speedups. Our approach for FPGA provides a 0.3×-4.4× speedup over same-generation GPU devices, where k is picked carefully for both architectures for a range of benchmarks.
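The tradeoff described in the abstract can be illustrated with a small sketch. This is not the paper's algorithm or code: it assumes a toy constant-coefficient tridiagonal matrix, and the function names (`spmv_tridiag`, `naive_k_steps`, `blocked_k_steps`) and the `block` parameter are hypothetical. It only shows the general idea of unrolling k matrix-vector steps per block: each block reads a ghost zone of width k once, then recomputes some neighbouring rows redundantly instead of re-reading the vector at every step.

```python
# Toy illustration only: constant-coefficient tridiagonal A, hypothetical helper names.

def spmv_tridiag(x, lo=1.0, di=-2.0, up=1.0):
    """One step y = A x, treating entries outside the vector as zero."""
    n = len(x)
    return [
        (lo * x[i - 1] if i > 0 else 0.0)
        + di * x[i]
        + (up * x[i + 1] if i < n - 1 else 0.0)
        for i in range(n)
    ]


def naive_k_steps(x, k):
    """k global steps; the full vector is re-read from memory at every step."""
    words_read = 0
    for _ in range(k):
        words_read += len(x)
        x = spmv_tridiag(x)
    return x, words_read


def blocked_k_steps(x, k, block):
    """Each block reads its slice plus a width-k ghost zone once, then runs k local steps."""
    n = len(x)
    out = [0.0] * n
    words_read = 0
    redundant_rows = 0
    for start in range(0, n, block):
        end = min(start + block, n)
        g0, g1 = max(0, start - k), min(n, end + k)
        local = x[g0:g1]
        words_read += g1 - g0
        for _ in range(k):
            # Rows near an artificial cut go stale by one more position per step,
            # but the ghost zone is wide enough that [start, end) stays exact.
            local = spmv_tridiag(local)
            # Upper bound on redundant work: every ghost row is recomputed each step.
            redundant_rows += (g1 - g0) - (end - start)
        out[start:end] = local[start - g0:end - g0]
    return out, words_read, redundant_rows


x = [float(i % 7) for i in range(32)]
ref, naive_words = naive_k_steps(x, k=3)
got, blocked_words, extra = blocked_k_steps(x, k=3, block=8)
print(got == ref, naive_words, blocked_words, extra)
```

In this sketch, a larger k lowers the number of vector reads relative to the naive loop but increases the count of redundantly computed rows, which mirrors the limit on GPU speedup that the abstract attributes to growth in redundant computation.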
