Journal: IEEE Journal of Solid-State Circuits

An Efficient Deep-Learning-Based Super-Resolution Accelerating SoC With Heterogeneous Accelerating and Hierarchical Cache


Abstract

This article presents an energy-efficient accelerating system-on-chip (SoC) for super-resolution (SR) image reconstruction on a mobile platform. With the rise of contactless communication and streaming services, the need for SR is growing. As one of the most basic low-level image processing algorithms, SR can reconstruct high-quality images from low-quality images which are noisy, compressed, or with damaged pixels. However, a massive amount of computation and considerable precision of pixel data pose challenges for acceleration in a resource- and bandwidth-constrained platform. SR has high energy consumption and long latency. While previous neural processing units (NPUs) reduced the precision to increase the efficiency and accelerate convolutional neural network (CNN) computation, few of them concentrated on both the output image quality and the performance of the entire system. The proposed SR SoC restores the high-quality image using a precision-optimized SR algorithm on an energy-efficient accelerating architecture and cache subsystem. It contributes three algorithm-hardware co-optimized features: 1) heterogeneous accelerating architecture (HAA) with only 8-bit floating-point (FP)-and-fixed-point (FXP) hybrid-precision for SR task; 2) tile-based hierarchical cache (THC) subsystem for the low energy and small footprint cost layer fusion; and 3) heterogeneous L1 data lifetime-aware optimized cache (DLOC) for the energy-efficient on-chip memory access. The prototype of SR SoC is fabricated in 65-nm technology and occupies a 10.0-mm$^2$ die area. The proposed SR SoC can maintain the high reconstruction quality while consuming only 19% of the energy of an FXP16 system with homogeneous NPU. As a result, the SR SoC presents $2.6\times$ higher energy efficiency than the previous SR-targeting NPU and achieves 107-frame-per-second (fps) framerates running $4\times$ SR image generation to full high definition (FHD) scale at only 0.92-mJ/frame energy consumption.
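The headline figures in the abstract can be cross-checked with a little arithmetic. The Python sketch below is only an illustration, not part of the article: it assumes that the 0.92-mJ/frame figure applies at the 107-fps operating point, and that "FHD" means a 1920 x 1080 output (so a $4\times$ SR input would be 480 x 270). The function names are hypothetical.

```python
# Back-of-the-envelope checks on the figures quoted in the abstract.
# Assumptions (not stated in the abstract): the per-frame energy is
# measured at the quoted frame rate, and FHD means 1920 x 1080 pixels.

FRAMES_PER_SECOND = 107        # quoted frame rate for 4x SR to FHD
ENERGY_PER_FRAME_MJ = 0.92     # quoted energy per frame, in millijoules
ENERGY_RATIO_VS_FXP16 = 0.19   # quoted fraction of FXP16 system energy
FHD_WIDTH, FHD_HEIGHT = 1920, 1080  # assumed FHD output resolution
SR_SCALE = 4                   # quoted upscaling factor


def average_power_mw(fps: float, energy_per_frame_mj: float) -> float:
    """Average power in mW: (frames/s) * (mJ/frame) = mJ/s = mW."""
    return fps * energy_per_frame_mj


def energy_per_output_pixel_nj(energy_per_frame_mj: float,
                               width: int, height: int) -> float:
    """Energy per output pixel in nJ (1 mJ = 1e6 nJ)."""
    return energy_per_frame_mj * 1e6 / (width * height)


def input_resolution(width: int, height: int, scale: int) -> tuple[int, int]:
    """Low-resolution input size implied by an integer upscaling factor."""
    return width // scale, height // scale


def energy_reduction_factor(ratio: float) -> float:
    """How many times less energy is used, given a fractional ratio."""
    return 1.0 / ratio


if __name__ == "__main__":
    print(f"average power: {average_power_mw(FRAMES_PER_SECOND, ENERGY_PER_FRAME_MJ):.2f} mW")
    print(f"energy per output pixel: "
          f"{energy_per_output_pixel_nj(ENERGY_PER_FRAME_MJ, FHD_WIDTH, FHD_HEIGHT):.3f} nJ")
    print(f"implied input size: {input_resolution(FHD_WIDTH, FHD_HEIGHT, SR_SCALE)}")
    print(f"energy reduction vs FXP16: {energy_reduction_factor(ENERGY_RATIO_VS_FXP16):.2f}x")
```

Under these assumptions the quoted numbers imply an average power of roughly 98 mW and a little under 0.5 nJ per output pixel; the 19% energy figure corresponds to about a $5.3\times$ reduction relative to the FXP16 baseline.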
