Conference: 2012 11th International Symposium on Parallel and Distributed Computing

Automatic Optimization of In-Flight Memory Transactions for GPU Accelerators Based on a Domain-Specific Language for Medical Imaging



Abstract

Efficient memory bandwidth utilization on GPU accelerators is crucial for memory-bound applications. In medical imaging, the performance of many kernels is limited by the available memory bandwidth because only a few operations are performed per pixel. Such kernels can exploit only a fraction of the compute power that GPU accelerators provide, and their performance is determined by memory bandwidth. As a remedy, this paper investigates the optimal utilization of available memory bandwidth by increasing the number of in-flight memory transactions. Instead of doing this manually for each GPU accelerator, the required CUDA and OpenCL code is generated automatically from descriptions in a Domain-Specific Language (DSL) for the considered application domain. Moreover, the DSL is extended to support global reduction operators. We show that the generated target-specific code significantly improves bandwidth utilization for memory-bound kernels. In addition, it achieves performance competitive with the GPU back end of the widely used image processing library OpenCV.
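
The abstract does not spell out why more in-flight memory transactions help, so the following is only a rough sketch of the usual reasoning, not the authors' method. It applies Little's law (data in flight = throughput × latency) to estimate how many independent loads each thread would need to keep outstanding to cover memory latency. The function names and all device numbers (bandwidth, latency, thread count) are hypothetical values chosen for illustration; they are not taken from the paper.

```python
import math


def bytes_in_flight(bandwidth_gb_per_s: float, latency_ns: float) -> float:
    """Little's law: bytes that must be in flight to sustain the bandwidth.

    GB/s * ns = (1e9 bytes/s) * (1e-9 s) = bytes, so the units cancel.
    """
    return bandwidth_gb_per_s * latency_ns


def loads_per_thread(bandwidth_gb_per_s: float, latency_ns: float,
                     active_threads: int, bytes_per_load: int) -> int:
    """Independent loads each thread needs outstanding to cover the latency."""
    needed = bytes_in_flight(bandwidth_gb_per_s, latency_ns)
    return math.ceil(needed / (active_threads * bytes_per_load))


# Hypothetical device parameters (assumptions, for illustration only).
BANDWIDTH_GB_S = 144.0   # peak memory bandwidth
LATENCY_NS = 500.0       # global memory latency
ACTIVE_THREADS = 7168    # threads resident on the whole device
BYTES_PER_LOAD = 4       # one 32-bit pixel per load

print(bytes_in_flight(BANDWIDTH_GB_S, LATENCY_NS))            # 72000.0
print(loads_per_thread(BANDWIDTH_GB_S, LATENCY_NS,
                       ACTIVE_THREADS, BYTES_PER_LOAD))       # 3
```

Under these assumed numbers, one load per thread leaves the memory system under-subscribed, which is the kind of gap that processing several pixels per thread is meant to close.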
