Journal: IEEE Transactions on Parallel and Distributed Systems

A Dynamic Approach for Workload Partitioning on GPU Architectures

Abstract

Workload partitioning and the subsequent work-item-to-thread mapping are key aspects to address when implementing any efficient GPU application. Different techniques have been proposed to deal with these issues, ranging from the computationally simplest static ones to the most complex dynamic ones. Each is best suited to particular workload characteristics (static for more regular workloads, dynamic for irregular ones). Nevertheless, none of them provides a sound tradeoff when applied in both cases. Static approaches lead to load imbalance on irregular problems, while the computational overhead introduced by dynamic or semi-dynamic approaches often worsens overall application performance on regular problems. This article presents an efficient dynamic technique for workload partitioning and work-item-to-thread mapping whose complexity is significantly reduced with respect to the other dynamic approaches in the literature. The article shows how the partitioning and mapping algorithm has been implemented by taking full advantage of the GPU device characteristics with the aim of minimizing the computational overhead involved. The article presents, compares, and analyzes the experimental results obtained by applying the proposed approach and several state-of-the-art static, dynamic, and semi-dynamic techniques to different benchmarks and across different GPU architectures (i.e., NVIDIA Fermi, Kepler, and Maxwell) to understand when and how each technique best applies.
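The tradeoff described in the abstract can be illustrated with a small CPU-side sketch. This is not the article's algorithm, which the abstract does not detail. It contrasts a simple static mapping (whole tasks assigned to threads round-robin) with a generic prefix-sum plus binary-search mapping, a common way to balance irregular workloads. The function names, the example `work_sizes` list, and the thread count are all hypothetical and chosen only for illustration.

```python
from bisect import bisect_right
from itertools import accumulate


def static_mapping(work_sizes, num_threads):
    """Static mapping: task i goes to thread i % num_threads.

    Returns the number of work items each thread ends up processing.
    """
    loads = [0] * num_threads
    for task, size in enumerate(work_sizes):
        loads[task % num_threads] += size
    return loads


def prefix_sum_mapping(work_sizes, num_threads):
    """Balanced mapping (illustrative, not the article's method).

    Work items are numbered globally via a prefix sum over the task sizes.
    Each thread takes a contiguous chunk of global ids and uses a binary
    search on the prefix sum to recover (task, local item) for each id.
    """
    # offsets[i] is the global id of the first work item of task i;
    # offsets[-1] is the total number of work items.
    offsets = [0] + list(accumulate(work_sizes))
    total = offsets[-1]
    chunk = -(-total // num_threads)  # ceiling division
    assignment = []
    for thread in range(num_threads):
        items = []
        for gid in range(thread * chunk, min((thread + 1) * chunk, total)):
            # Last task whose starting offset is <= gid (skips empty tasks).
            task = bisect_right(offsets, gid) - 1
            items.append((task, gid - offsets[task]))
        assignment.append(items)
    return assignment


# Hypothetical irregular workload: most tasks are tiny, two are large.
work_sizes = [1, 1, 12, 1, 1, 0, 2, 6]
static_loads = static_mapping(work_sizes, 4)
dynamic_loads = [len(items) for items in prefix_sum_mapping(work_sizes, 4)]

print(static_loads)   # [2, 1, 14, 7]
print(dynamic_loads)  # [6, 6, 6, 6]
```

In this sketch the static mapping leaves one thread with 14 work items while another has 1, whereas the balanced mapping gives every thread 6. The price is one binary search per work item, which is the kind of extra computation that can outweigh the benefit when the workload is already regular.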
