IEEE Computer Architecture Letters

Runtime Support for Accelerating CNN Models on Digital DRAM Processing-in-Memory Hardware

Abstract

Processing-in-memory (PIM) offers a promising solution to the main-memory bottleneck by placing computational logic in or near memory devices to reduce data-movement overheads. Recent work explored how commercial DRAM can incorporate digital PIM logic while meeting fab-level energy and area constraints, and showed significant speedups in the inference time of data-intensive deep learning models. However, convolutional neural network (CNN) models have not been considered main targets for commercial DRAM-PIM because of their compute-intensive convolution layers. Moreover, recent studies revealed that the area and power constraints of the memory die prevent DRAM-PIM from competing with GPUs and specialized accelerators in accelerating such layers. Mobile CNN models, however, have increasingly replaced these compute-intensive convolutions with a composition of depthwise and pointwise convolutions, reducing computation cost without an accuracy drop. In this paper, we show that 1x1 (pointwise) convolution can be offloaded to DRAM-PIM for acceleration with integrated runtime support and without any hardware or algorithm changes. We provide further speedup through parallel execution on the GPU and DRAM-PIM and through code-generation optimizations. Our solution achieves up to a 35.2x speedup (31.6x on average) over a GPU across all 1x1 convolutions in mobile CNN models.
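
As background for why the 1x1 layers in particular are offloadable, note that a pointwise convolution is mathematically a plain matrix multiplication over the flattened spatial positions, the kind of memory-bound GEMM that digital DRAM-PIM logic targets. The NumPy sketch below illustrates this equivalence; the function names and tensor shapes are ours for illustration only and are not part of the paper's runtime interface.

```python
import numpy as np

# A 1x1 (pointwise) convolution mixes channels at each spatial position
# independently, so it is exactly a matrix multiplication between the
# flattened activations (H*W rows, C_in columns) and the weight matrix
# (C_in x C_out). This GEMM view is what makes the layer memory-bound
# at inference batch sizes and hence a candidate for PIM offloading.

def pointwise_conv_as_gemm(x, w):
    """x: activations of shape (H, W, C_in); w: weights of shape (C_in, C_out)."""
    h, width, c_in = x.shape
    y = x.reshape(h * width, c_in) @ w      # one GEMM replaces the conv
    return y.reshape(h, width, -1)

def pointwise_conv_naive(x, w):
    """Reference: explicit 1x1 convolution, pixel by pixel."""
    h, width, c_in = x.shape
    y = np.empty((h, width, w.shape[1]))
    for i in range(h):
        for j in range(width):
            y[i, j] = x[i, j] @ w           # per-pixel channel mixing
    return y

rng = np.random.default_rng(0)
x = rng.standard_normal((56, 56, 144))      # illustrative mobile-CNN layer size
w = rng.standard_normal((144, 32))
assert np.allclose(pointwise_conv_as_gemm(x, w), pointwise_conv_naive(x, w))
```

Because the GEMM form streams every activation and weight exactly once with little reuse per element, it is bound by memory bandwidth rather than compute, which is the regime where in-DRAM execution can beat a GPU.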
