A Dynamic Approach for Workload Partitioning on GPU Architectures

Federico Busato; Nicola Bombieri

Abstract

Workload partitioning and the subsequent work item-to-thread mapping are key issues in implementing any efficient GPU application. Different techniques have been proposed to address them, ranging from the computationally simplest static approaches to the most complex dynamic ones. Each works best for particular workload characteristics: static techniques for more regular workloads, dynamic techniques for irregular ones. Nevertheless, none of them provides a sound tradeoff across both cases. Static approaches lead to load imbalance on irregular problems, while the computational overhead introduced by dynamic or semi-dynamic approaches often worsens overall application performance on regular problems. This article presents an efficient dynamic technique for workload partitioning and work item-to-thread mapping whose complexity is significantly lower than that of the other dynamic approaches in the literature. The article shows how the partitioning and mapping algorithm has been implemented to take full advantage of the GPU device characteristics, with the aim of minimizing the computational overhead involved. It then presents, compares, and analyzes the experimental results obtained by applying the proposed approach and several state-of-the-art static, dynamic, and semi-dynamic techniques to different benchmarks on different GPU architectures (NVIDIA Fermi, Kepler, and Maxwell), to understand when and how each technique best applies.
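The contrast between static and dynamic mapping can be sketched on the CPU as follows. This is an illustration only and not the technique proposed in the article, which the abstract does not detail. It assumes a generic static scheme (whole work units assigned round-robin to threads) and a generic dynamic scheme (work items split evenly, with each thread locating its starting work unit by binary search over a prefix sum). The names `static_mapping`, `dynamic_mapping`, and `work_sizes` are hypothetical.

```python
from bisect import bisect_right
from itertools import accumulate


def static_mapping(work_sizes, num_threads):
    """Assign whole work units to threads round-robin.

    Returns the number of work items each thread ends up processing.
    """
    load = [0] * num_threads
    for unit, size in enumerate(work_sizes):
        load[unit % num_threads] += size
    return load


def dynamic_mapping(work_sizes, num_threads):
    """Split the flattened work items into equal chunks, one per thread.

    Returns, for each thread, (index of the work unit holding its first
    item, number of items it processes). The unit is found by binary
    search over the exclusive prefix sum of the unit sizes.
    """
    # offsets[i] is the global index of the first item of work unit i.
    offsets = [0] + list(accumulate(work_sizes))
    total = offsets[-1]
    chunk = -(-total // num_threads)  # ceiling division
    assignment = []
    for t in range(num_threads):
        first = min(t * chunk, total)
        last = min(first + chunk, total)
        unit = bisect_right(offsets, first) - 1
        assignment.append((unit, last - first))
    return assignment


# Hypothetical irregular workload: one work unit is far larger than the rest.
work_sizes = [1, 1, 50, 2, 1, 1, 3, 1]
static_load = static_mapping(work_sizes, 4)
dynamic = dynamic_mapping(work_sizes, 4)
dynamic_load = [items for _, items in dynamic]

print("static load per thread: ", static_load)
print("dynamic load per thread:", dynamic_load)
```

On this irregular input the static scheme leaves one thread with most of the work, while the dynamic scheme balances the item counts at the cost of computing a prefix sum and one binary search per thread. That extra cost is the kind of overhead the abstract says can hurt performance on regular workloads.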

Bibliographic record

Source
IEEE Transactions on Parallel and Distributed Systems | 2017, Issue 6 | pp. 1535-1549 | 15 pages
Authors
Federico Busato; Nicola Bombieri
Affiliations
Department of Computer Science, University of Verona, VR, Italy (both authors)
Original format: PDF
Language: English
Keywords
Graphics processing units; Instruction sets; Indexes; Heuristic algorithms; Message systems; Parallel processing; Benchmark testing


