Workload-aware Automatic Parallelization for Multi-GPU DNN Training

IEEE International Conference on Acoustics, Speech and Signal Processing

Abstract

Deep neural networks (DNNs) have emerged as successful solutions for a variety of artificial intelligence applications, but their very large and deep models impose high computational requirements during training. Multi-GPU parallelization is a popular option for accelerating the demanding computations in DNN training, but most state-of-the-art multi-GPU deep learning frameworks not only require users to have an in-depth understanding of the frameworks' implementations, but also apply parallelization in a straightforward way without optimizing GPU utilization. In this work, we propose a workload-aware auto-parallelization framework (WAP) for DNN training, in which the work is automatically distributed to multiple GPUs based on workload characteristics. We evaluate WAP in TensorFlow with popular DNN benchmarks (AlexNet and VGG-16), show training throughput competitive with state-of-the-art frameworks, and demonstrate that WAP automatically optimizes GPU assignment based on the workload's compute requirements, thereby improving energy efficiency.
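The abstract does not spell out how WAP estimates workload or decides GPU assignment, so the following is only a minimal, self-contained sketch of the general idea it describes: estimate the compute cost of a training step from per-layer costs and the batch size, then use only as many GPUs as that cost warrants. All layer names, FLOP figures, budgets, and helper functions below are illustrative assumptions, not the paper's actual method or API.

```python
# Illustrative sketch of workload-aware GPU assignment (not the WAP algorithm
# from the paper). Per-layer FLOP figures and the per-GPU budget are made up.
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Layer:
    name: str
    flops_per_sample: float  # estimated forward+backward FLOPs per sample


def estimate_batch_flops(layers: List[Layer], batch_size: int) -> float:
    """Total estimated FLOPs for one training step at the given batch size."""
    return batch_size * sum(layer.flops_per_sample for layer in layers)


def choose_gpu_count(batch_flops: float,
                     available_gpus: int,
                     per_gpu_flops_budget: float) -> int:
    """Pick the smallest GPU count whose combined budget covers the step.

    Light workloads stay on fewer GPUs (saving energy); heavy workloads
    spill onto more GPUs to sustain throughput.
    """
    needed = math.ceil(batch_flops / per_gpu_flops_budget)
    return max(1, min(needed, available_gpus))


def split_batch(batch_size: int, num_gpus: int) -> List[int]:
    """Even data-parallel split of the global batch across the chosen GPUs."""
    base, rem = divmod(batch_size, num_gpus)
    return [base + (1 if i < rem else 0) for i in range(num_gpus)]


if __name__ == "__main__":
    # Rough, hypothetical per-sample costs for a few layers of an
    # AlexNet-like model.
    model = [
        Layer("conv1", 2.0e8),
        Layer("conv2", 4.5e8),
        Layer("fc6", 0.8e8),
    ]
    batch = 256
    flops = estimate_batch_flops(model, batch)
    gpus = choose_gpu_count(flops, available_gpus=4,
                            per_gpu_flops_budget=6.0e10)
    print(f"step cost ~{flops:.2e} FLOPs -> use {gpus} GPU(s), "
          f"per-GPU batches {split_batch(batch, gpus)}")
```

With a smaller batch or a lighter model, the same logic keeps the work on a single GPU, which is the kind of workload-dependent assignment the abstract credits with the energy-efficiency improvement.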
