Journal: ACM SIGPLAN Notices: A Monthly Publication of the Special Interest Group on Programming Languages

Deep Neural Networks Compiler for a Trace-Based Accelerator (Short WIP Paper)


Abstract

Deep Neural Networks (DNNs) are the algorithm of choice for image processing applications. DNNs present highly parallel workloads that lead to the emergence of custom hardware accelerators. Deep Learning (DL) models specialized in different tasks require programmable custom hardware and a compiler/mapper to efficiently translate different DNNs into an efficient dataflow in the accelerator. The goal of this paper is to present a compiler for running DNNs on Snowflake, which is a programmable hardware accelerator that targets DNNs. The compiler correctly generates instructions for various DL models: AlexNet, VGG, ResNet and LightCNN-9. Snowflake, with a varying number of processing units, was implemented on FPGA to measure the compiler and Snowflake performance properties upon scaling up. The system achieves 70 frames/s and 4.5 GB/s of off-chip memory bandwidth for AlexNet without linear layers on Xilinx's Zynq-SoC XC7Z045 FPGA.
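To illustrate at a high level what such a compiler/mapper does, the sketch below maps a convolution-layer description onto a tiled instruction trace for an accelerator with a configurable number of processing units. The ConvLayer/Instr structures, the instruction names (LOAD_W, LOAD_IFM, MAC_TILE, STORE_OFM), and the tiling scheme are illustrative assumptions, not Snowflake's actual instruction set or the compiler described in the paper.

```python
# A minimal, hypothetical sketch of layer-to-trace mapping for a trace-based
# accelerator. Instruction names, tile sizes, and data structures are
# illustrative assumptions, NOT Snowflake's ISA or the paper's compiler.
from dataclasses import dataclass
from typing import List


@dataclass
class ConvLayer:
    name: str
    in_channels: int
    out_channels: int
    height: int       # output feature-map height
    width: int        # output feature-map width
    kernel: int


@dataclass
class Instr:
    op: str            # "LOAD_W", "LOAD_IFM", "MAC_TILE", or "STORE_OFM"
    operands: tuple


def compile_layer(layer: ConvLayer, num_pus: int, tile_rows: int = 8) -> List[Instr]:
    """Emit a flat instruction trace that tiles the output feature map
    across the accelerator's processing units (PUs)."""
    trace: List[Instr] = []
    # Load the layer's weights once (assumes they fit in on-chip memory).
    trace.append(Instr("LOAD_W", (layer.name, layer.out_channels,
                                  layer.in_channels, layer.kernel)))
    # Distribute output channels round-robin over PUs; tile rows of the map.
    for oc in range(layer.out_channels):
        pu = oc % num_pus
        for row0 in range(0, layer.height, tile_rows):
            rows = min(tile_rows, layer.height - row0)
            trace.append(Instr("LOAD_IFM", (layer.name, row0, rows)))
            trace.append(Instr("MAC_TILE", (pu, oc, row0, rows, layer.width)))
            trace.append(Instr("STORE_OFM", (pu, oc, row0, rows)))
    return trace


if __name__ == "__main__":
    # AlexNet-like first convolution layer, mapped onto 4 processing units.
    conv1 = ConvLayer("conv1", in_channels=3, out_channels=96,
                      height=55, width=55, kernel=11)
    trace = compile_layer(conv1, num_pus=4)
    print(f"{len(trace)} instructions for {conv1.name}")
    print(trace[:4])
```

Running the sketch on an AlexNet-like conv1 shape simply prints the size of the generated trace and its first few instructions; it is meant only to make the notion of compiling a DNN layer into an accelerator instruction trace concrete.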
