IEEE International Conference on Artificial Intelligence Circuits and Systems

End-to-end 100-TOPS/W Inference With Analog In-Memory Computing: Are We There Yet?

Abstract

In-Memory Acceleration (IMA) promises major efficiency improvements in deep neural network (DNN) inference, but challenges remain in the integration of IMA within a digital system. We propose a heterogeneous architecture coupling 8 RISC-V cores with an IMA in a shared-memory cluster, analyzing the benefits and trade-offs of in-memory computing on the realistic use case of a MobileNetV2 bottleneck layer. We explore several IMA integration strategies, analyzing performance, area, and energy efficiency. We show that while pointwise layers achieve significant speed-ups over a software implementation, on depthwise layers the inability to efficiently map parameters onto the accelerator leads to a significant trade-off between throughput and area. We propose a hybrid solution where pointwise convolutions are executed on the IMA while depthwise convolutions run on the cluster cores, achieving a 3x speed-up over software execution while saving 50% of area compared to an all-in IMA solution with similar performance.
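
The trade-off the abstract describes stems from how convolution weights map onto an analog crossbar: a pointwise (1x1) layer is a dense C_in x C_out matrix that fills the array, whereas a depthwise layer's weights are block-diagonal (each output channel reads only its own input channel), so only roughly 1/C of the crossbar cells would hold useful parameters. The Python sketch below illustrates the hybrid mapping under these assumptions; it is not the paper's code, and the Layer/map_layer names and channel counts are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Layer:
    kind: str     # "pointwise" (1x1 conv) or "depthwise" (3x3 per-channel conv)
    in_ch: int
    out_ch: int

def map_layer(layer: Layer) -> str:
    # Hybrid policy from the abstract: dense pointwise weight matrices
    # fill the IMA crossbar efficiently, while depthwise weights would
    # occupy only a block-diagonal sliver of it, so they stay in software
    # on the RISC-V cluster cores.
    return "IMA" if layer.kind == "pointwise" else "RISC-V cores"

# A MobileNetV2 bottleneck (expansion factor 6, hypothetical channel counts):
# expand (1x1) -> depthwise (3x3) -> project (1x1).
bottleneck = [
    Layer("pointwise", 32, 192),
    Layer("depthwise", 192, 192),
    Layer("pointwise", 192, 64),
]

for layer in bottleneck:
    print(f"{layer.kind:9} {layer.in_ch:3}->{layer.out_ch:<3} on {map_layer(layer)}")
```

Under this policy, only the two pointwise stages occupy crossbar area, which is consistent with the reported 50% area saving relative to mapping the whole bottleneck onto the IMA.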
