GPU-Job Migration: The rCUDA Case

Abstract

Virtualization techniques have been shown to benefit data centers and other computing facilities. Not only do virtual machines make it possible to reduce the size of the computing infrastructure while increasing overall resource utilization, but virtualizing individual components of a computer may also provide significant benefits. This is the case, for instance, for remote GPU virtualization, which has been implemented in several frameworks in recent years. The considerable flexibility that remote GPU virtualization provides can be increased further by adding a migration mechanism, so that the GPU part of an application can be live-migrated, transparently and during execution, to another GPU elsewhere in the cluster. In this paper we present the implementation of such a migration mechanism within the rCUDA remote GPU virtualization middleware, together with a thorough performance analysis of that implementation. To that end, we use both synthetic and real production applications, three different generations of NVIDIA GPUs, and two different versions of the InfiniBand interconnect. Several use cases are provided to show the substantial benefits that GPU-job migration can bring to data centers.