42nd Annual International Symposium on Computer Architecture (ISCA)

DjiNN and Tonic: DNN as a service and its implications for future warehouse scale computers



Abstract

As applications such as Apple Siri, Google Now, Microsoft Cortana, and Amazon Echo continue to gain traction, webservice companies are adopting large deep neural networks (DNNs) for machine learning challenges such as image processing, speech recognition, and natural language processing. A number of open questions arise as to the design of a server platform specialized for DNNs and how modern warehouse scale computers (WSCs) should be outfitted to provide DNN as a service for these applications. In this paper, we present DjiNN, an open infrastructure for DNN as a service in WSCs, and Tonic Suite, a suite of 7 end-to-end applications that span image, speech, and language processing. We use DjiNN to design a high-throughput DNN system based on massive GPU server designs and provide insights into the varying characteristics across applications. After studying the throughput, bandwidth, and power properties of DjiNN and Tonic Suite, we investigate several design points for future WSC architectures, including the total-cost-of-ownership implications of a WSC with a disaggregated GPU pool versus a WSC composed of homogeneous integrated GPU servers. We improve DNN throughput by over 120× for all but one application (40× for Facial Recognition) on an NVIDIA K40 GPU. On a GPU server composed of 8 NVIDIA K40s, we achieve near-linear scaling (around 1000× throughput improvement) for 3 of the 7 applications. Through our analysis, we also find that GPU-enabled WSCs improve total cost of ownership over CPU-only designs by 4–20×, depending on the composition of the workload.
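The throughput gains the abstract reports come largely from batching many inference requests into a single GPU invocation, amortizing fixed per-launch costs across the batch. The sketch below is illustrative only and is not the DjiNN API: the batch size, overhead, and per-item costs are hypothetical numbers chosen to show the shape of the effect, not measurements from the paper.

```python
# Illustrative sketch (hypothetical costs, not DjiNN's API or the paper's data):
# why batching DNN requests on a GPU raises throughput. Assume each GPU
# invocation pays a fixed overhead plus a small per-item cost; batching
# amortizes the overhead over the whole batch.

def batches(requests, batch_size):
    """Group incoming requests into fixed-size batches (last one may be short)."""
    return [requests[i:i + batch_size] for i in range(0, len(requests), batch_size)]

def service_time_ms(requests, batch_size, launch_overhead_ms=5.0, per_item_ms=0.2):
    """Total time to serve all requests with one GPU invocation per batch."""
    return sum(launch_overhead_ms + per_item_ms * len(b)
               for b in batches(requests, batch_size))

reqs = list(range(1000))
unbatched = service_time_ms(reqs, batch_size=1)    # 1000 launches, overhead dominates
batched = service_time_ms(reqs, batch_size=128)    # 8 launches, overhead amortized
print(f"unbatched: {unbatched:.0f} ms, batched: {batched:.0f} ms, "
      f"speedup: {unbatched / batched:.1f}x")
```

Under these assumed costs, batching cuts total service time by roughly an order of magnitude; the same amortization argument underlies why multi-GPU servers can scale near-linearly when batches keep each GPU saturated.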
