Journal: International Journal of Parallel Programming

MulticoreBSP for C: A High-Performance Library for Shared-Memory Parallel Programming

Abstract

The bulk synchronous parallel (BSP) model, as well as parallel programming interfaces based on BSP, classically target distributed-memory parallel architectures. In earlier work, Yzelman and Bisseling designed a MulticoreBSP for Java library specifically for shared-memory architectures. In the present article, we further investigate this concept and introduce the new high-performance MulticoreBSP for C library. Among other features, this library supports nested BSP runs. We show that existing BSP software performs well regardless of whether it runs on distributed-memory or shared-memory architectures, and that applications in MulticoreBSP can attain high performance. The paper details the implementation of the fast Fourier transform and sparse matrix-vector multiplication in BSP, both of which outperform state-of-the-art implementations written in other shared-memory parallel programming interfaces. We furthermore study the applicability of BSP on highly non-uniform memory access architectures.
