IEEE International Symposium on Circuits and Systems

Instruction-based high-efficient synchronization in a many-core Network-on-Chip processor

Abstract

Parallelized applications running on many-core Network-on-Chip (NoC) processors may spend a large part of their execution time synchronizing threads mapped onto multiple NoC nodes if synchronization is not carefully designed. In this paper, we propose an instruction-based synchronization solution for a packet-switched many-core NoC processor with a 2D mesh topology. Return links are added to the on-chip network to transmit acknowledgements of read requests, and a dedicated instruction, SET, is added as an instruction set extension to the original pipeline to perform atomic read-modify-write operations. To support various synchronization schemes, a hardware unit, SYNC, containing globally addressable registers that serve as shared variables handles synchronization requests from both local and remote NoC nodes. Additionally, a FIFO in the SYNC unit stores these synchronization requests so that polling on shared variables happens locally, which greatly reduces the network contention caused by busy-wait synchronization algorithms. Synchronization schemes including spinlock, barrier, FIFO spinlock, and semaphore are implemented as inline assembly functions. Synthesis results in a 55 nm process suggest low area and power overhead for the hardware design. The performance of the synchronization schemes is evaluated and compared with conventional methods and prior work, showing that the proposed solution is more efficient.
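
To make the locking idea in the abstract concrete, the following is a minimal behavioral sketch in Python, not the paper's hardware or its SET instruction encoding. It assumes a lock is a shared register that is 0 when free and 1 when held, that the atomic read-modify-write behaves like a test-and-set, and that a request for a held lock is parked in the unit's FIFO instead of being retried over the network. The names `SyncUnit`, `test_and_set`, `acquire`, and `release` are hypothetical and chosen only for illustration.

```python
from collections import deque


class SyncUnit:
    """Toy model of a SYNC-like unit: shared registers plus a request FIFO.

    All behavior here is an illustrative assumption, not the paper's design.
    """

    def __init__(self, num_regs=4):
        self.regs = [0] * num_regs   # globally addressable shared variables
        self.pending = deque()       # FIFO of (node_id, reg) requests parked locally

    def test_and_set(self, reg):
        """Atomic read-modify-write: return the old value, then set the register to 1."""
        old = self.regs[reg]
        self.regs[reg] = 1
        return old

    def acquire(self, node_id, reg):
        """Handle one lock request. Return True if the lock is granted immediately."""
        if self.test_and_set(reg) == 0:
            return True              # lock was free; acknowledge right away
        # Lock is held: park the request so the node does not retry over the network.
        self.pending.append((node_id, reg))
        return False

    def release(self, reg):
        """Free the lock and grant it to the oldest waiter on this register, if any.

        Return the node id that now holds the lock, or None if nobody was waiting.
        """
        self.regs[reg] = 0
        for i, (node_id, waiting_reg) in enumerate(self.pending):
            if waiting_reg == reg:
                del self.pending[i]      # remove the oldest matching request
                self.test_and_set(reg)   # re-take the lock on the waiter's behalf
                return node_id
        return None


# Usage: three nodes contend for the lock held in register 0.
sync = SyncUnit()
print(sync.acquire(node_id=0, reg=0))  # node 0 gets the lock
print(sync.acquire(node_id=1, reg=0))  # node 1 is queued
print(sync.acquire(node_id=2, reg=0))  # node 2 is queued behind node 1
print(sync.release(0))                 # lock passes to node 1
print(sync.release(0))                 # lock passes to node 2
print(sync.release(0))                 # no waiters remain
```

In this sketch each waiting node sends a single request and is served in arrival order, which is the property that lets polling stay local to the unit rather than generating repeated network traffic.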
