Computer Communications

A deep neural network compression algorithm based on knowledge transfer for edge devices



Abstract

The computation and storage capacity of edge devices are limited, which severely restricts the deployment of deep neural networks on such devices. Toward intelligent applications on edge devices, we introduce a deep neural network compression algorithm based on knowledge transfer, a three-stage pipeline of lightweighting, multi-level knowledge transfer, and pruning that reduces the network depth, parameter count, and operation complexity of deep neural networks. We lighten the networks by using a global average pooling layer instead of a fully connected layer and by replacing standard convolutions with separable convolutions. Next, multi-level knowledge transfer minimizes the difference between the outputs of the "student network" and the "teacher network" at both the middle and logits layers, increasing the supervised information available when training the "student network". Lastly, we prune the network by cutting off unimportant convolution kernels with a global iterative pruning strategy. The experimental results show that the proposed method is up to 30% more efficient than the knowledge distillation method at reducing the loss of classification performance. Benchmarked on a GPU (Graphics Processing Unit) server, a Raspberry Pi 3, and a Cambricon-1A, the compressed network obtained with our knowledge transfer and pruning method achieves more than 49.5× parameter compression, and the time efficiency of a single feedforward operation improves by more than 3.2×.
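The three stages described in the abstract can be sketched in miniature. The following NumPy illustration is an assumption-laden sketch, not the authors' implementation: the function names, the α weighting between the two knowledge-transfer terms, and the L1-norm kernel-importance criterion for pruning are all hypothetical choices made for illustration.

```python
import numpy as np

# --- Stage 1: lightweighting ---
# A depthwise separable convolution replaces one standard k x k convolution
# with a per-channel depthwise pass plus a 1x1 pointwise pass, which is
# where the parameter reduction comes from.
def standard_conv_params(c_in, c_out, k=3):
    return c_in * c_out * k * k

def separable_conv_params(c_in, c_out, k=3):
    # depthwise: k*k weights per input channel; pointwise: c_in x c_out
    return c_in * k * k + c_in * c_out

# --- Stage 2: multi-level knowledge transfer (sketch) ---
def softmax(z, t=1.0):
    z = z / t
    e = np.exp(z - z.max())
    return e / e.sum()

def multi_level_kt_loss(student_logits, teacher_logits,
                        student_feat, teacher_feat,
                        temperature=4.0, alpha=0.5):
    # logits level: KL divergence between temperature-softened distributions
    p_t = softmax(teacher_logits, temperature)
    p_s = softmax(student_logits, temperature)
    kl = np.sum(p_t * (np.log(p_t) - np.log(p_s)))
    # middle level: mean squared error between intermediate feature maps
    mse = np.mean((student_feat - teacher_feat) ** 2)
    return alpha * kl + (1 - alpha) * mse

# --- Stage 3: global iterative pruning (sketch) ---
def prune_step(kernels, prune_frac=0.1):
    # kernels: list of (layer_id, [kernel arrays]); rank every kernel in the
    # network globally by L1 norm and drop the lowest prune_frac this step.
    norms = [(lid, i, np.abs(w).sum())
             for lid, ws in kernels for i, w in enumerate(ws)]
    norms.sort(key=lambda x: x[2])
    n_drop = int(len(norms) * prune_frac)
    dropped = {(lid, i) for lid, i, _ in norms[:n_drop]}
    return [(lid, [w for i, w in enumerate(ws) if (lid, i) not in dropped])
            for lid, ws in kernels]
```

For a 3×3 layer with 64 input and 128 output channels, the separable variant needs 8,768 weights versus 73,728 for the standard convolution, roughly an 8× saving; iterating `prune_step` until a target sparsity is reached corresponds to the global iterative pruning strategy mentioned in the abstract.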
