Journal of signal processing systems for signal, image, and video technology

Decoupled Processors Architecture for Accelerating Data Intensive Applications using Scratch-Pad Memory Hierarchy



Abstract

We present an architecture of decoupled processors with a memory hierarchy consisting only of scratch-pad memories and a main memory. This architecture exploits the more efficient prefetching of decoupled processors, which make use of the parallelism between address computation and application data processing that exists mainly in streaming applications. This benefit, combined with the ability of scratch-pad memories to store data with no conflict misses and low energy per access, contributes significantly to increasing the system's performance. The application code is split into two parallel programs: the first runs on the Access processor and computes the addresses of the data in the memory hierarchy; the second processes the application data and runs on the Execute processor, a processor with a limited address space covering just the register file addresses. Every transfer of any block in the memory hierarchy, up to the Execute processor's register file, is controlled by the Access processor and the DMA units. This strongly differentiates the architecture from traditional uniprocessors and from existing decoupled processors with cache memory hierarchies. The architecture is compared in performance with uniprocessor architectures with (a) scratch-pad and (b) cache memory hierarchies, and with (c) existing decoupled architectures, showing higher normalized performance. The reason for this gain is the efficiency of data transfer that the scratch-pad memory hierarchy provides, combined with the ability of the decoupled processors to eliminate memory latency by using memory management techniques for transferring data instead of fixed prefetching methods. Experimental results show that performance is increased by up to almost 2 times compared to uniprocessor architectures with scratch-pad memory and up to 3.7 times compared to those with cache. The proposed architecture achieves this performance without penalties in energy-delay product.
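
To make the access/execute split concrete, the following minimal C sketch shows how a streaming kernel (element-wise scaling) could be partitioned in the spirit described above. It is an illustration only, run sequentially on a host: the names (SPM_BLOCK, dma_fetch, dma_store, access_program, execute_program) are hypothetical and not taken from the paper, the DMA engines are emulated with plain copies, and in the actual architecture the two programs would run concurrently on the separate Access and Execute processors with double-buffered scratch-pad blocks.

/*
 * Conceptual sketch of the access/execute split, assuming a hypothetical
 * DMA interface.  The execute-side code never forms main-memory addresses;
 * it only sees the scratch-pad block handed to it by the access side.
 */
#include <stddef.h>
#include <stdio.h>

#define N          1024          /* elements in the input stream       */
#define SPM_BLOCK  64            /* scratch-pad block size (elements)  */

static float main_memory_in[N];  /* stands in for off-chip main memory */
static float main_memory_out[N];
static float spm[SPM_BLOCK];     /* stands in for one scratch-pad buffer */

/* Stand-in for a DMA transfer: main memory -> scratch-pad. */
static void dma_fetch(const float *src, float *dst, size_t n)
{
    for (size_t i = 0; i < n; i++) dst[i] = src[i];
}

/* Stand-in for a DMA transfer: scratch-pad -> main memory. */
static void dma_store(const float *src, float *dst, size_t n)
{
    for (size_t i = 0; i < n; i++) dst[i] = src[i];
}

/* Execute-side code: processes only the scratch-pad block it is given. */
static void execute_program(float *block, size_t n, float scale)
{
    for (size_t i = 0; i < n; i++) block[i] *= scale;
}

/* Access-side code: computes all main-memory addresses and drives the
 * (simulated) DMA transfers, block by block. */
static void access_program(float scale)
{
    for (size_t base = 0; base < N; base += SPM_BLOCK) {
        dma_fetch(&main_memory_in[base], spm, SPM_BLOCK);
        execute_program(spm, SPM_BLOCK, scale);   /* hand the block over */
        dma_store(spm, &main_memory_out[base], SPM_BLOCK);
    }
}

int main(void)
{
    for (size_t i = 0; i < N; i++) main_memory_in[i] = (float)i;
    access_program(2.0f);
    printf("out[10] = %.1f\n", main_memory_out[10]);  /* expect 20.0 */
    return 0;
}

The property the sketch preserves is the one the abstract emphasizes: address computation and block movement are owned entirely by the access side, so the execute side needs no address space beyond its local (register-file-like) storage.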
