IEEE International Conference on Distributed Computing Systems

Context-Aware Deep Model Compression for Edge Cloud Computing



Abstract

While deep neural networks (DNNs) have led to a paradigm shift, their exorbitant computational requirements have long been a roadblock to their deployment at the edge, such as on wearable devices and smartphones. Hence, hybrid edge-cloud computing frameworks have been proposed to offload part of the computation to the cloud by naively partitioning DNN operations under the assumption of a constant network condition. However, real-world network state varies greatly with context, and DNN partitioning alone offers only a limited strategy space. In this paper, we exploit the structural flexibility of DNNs to fit the edge model to varying network contexts and different deployment platforms. Specifically, we design a reinforcement learning-based decision engine that searches for model transformation strategies against a combined objective of model accuracy and computation latency. The engine generates a context-aware model tree so that the DNN can decide at runtime which model branch to switch to. Emulation and field experiments show that our approach achieves a 30%-50% latency reduction while retaining model accuracy.
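The abstract describes a combined objective of model accuracy and computation latency, used both by the reinforcement-learning search and by the runtime branch switch over the generated model tree. The following is a minimal illustrative sketch of that selection step; the branch names, accuracy/latency figures, weighting scheme, and `alpha` parameter are assumptions for illustration, not the paper's actual formulation.

```python
# Hypothetical sketch: pick the model-tree branch that maximizes a
# combined accuracy/latency objective under the current network context.
# All names and numbers below are illustrative assumptions.
from dataclasses import dataclass


@dataclass
class Branch:
    name: str
    accuracy: float    # validation accuracy of this compressed variant
    latency_ms: float  # end-to-end latency measured in this network context


def reward(branch: Branch, alpha: float = 0.5) -> float:
    # Combined objective: higher accuracy and lower latency both raise
    # the reward; alpha trades one off against the other.
    return alpha * branch.accuracy - (1 - alpha) * branch.latency_ms / 100.0


def select_branch(branches: list[Branch], alpha: float = 0.5) -> Branch:
    # Runtime decision: switch to the branch with the best combined score.
    return max(branches, key=lambda b: reward(b, alpha))


# Illustrative model tree: three compression variants of one DNN.
tree = [
    Branch("full-model", accuracy=0.92, latency_ms=180.0),
    Branch("pruned-50%", accuracy=0.89, latency_ms=95.0),
    Branch("edge-only", accuracy=0.85, latency_ms=60.0),
]

best = select_branch(tree)
print(best.name)  # → edge-only
```

In this toy setting the heavily compressed variant wins because its latency saving outweighs its accuracy loss at `alpha = 0.5`; a larger `alpha` would favor the more accurate branches. The paper's engine searches the transformation strategies that populate such a tree offline, so the runtime decision reduces to a cheap lookup like the one above.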


