Conference: IEEE International Parallel and Distributed Processing Symposium

A Study of Single and Multi-device Synchronization Methods in Nvidia GPUs

Abstract

GPUs are playing an increasingly important role in general-purpose computing. Many algorithms require synchronization at different levels of granularity within a single GPU. Additionally, the emergence of dense GPU nodes calls for multi-GPU synchronization. Nvidia’s latest CUDA provides a variety of synchronization methods, but until now there has been no full understanding of their characteristics. This work explores important undocumented features and provides an in-depth analysis of the performance considerations and pitfalls of the state-of-the-art synchronization methods for Nvidia GPUs. The analysis is intended to inform design choices for applications, libraries, and frameworks running in single-GPU and/or multi-GPU environments. We provide a case study of the commonly used reduction operator to illustrate how the knowledge gained in our analysis can be applied. We also describe our micro-benchmarks and measurement methods.
