Performance modeling and measurement of parallelized code for distributed shared memory multiprocessors

Abstract

This paper presents a model for evaluating the performance and overhead of parallelizing sequential code with compiler directives for multiprocessing on distributed shared memory (DSM) systems. We parallelized the sequential implementation of the NAS benchmarks using native Fortran77 compiler directives on an Origin2000, which is a DSM system. We report the measurement-based performance of these parallelized benchmarks from four perspectives: efficacy of the parallelization process; scalability; parallelization overhead; and comparison with hand-parallelized and hand-optimized versions of the same benchmarks. Our results indicate that sequential programs can conveniently be parallelized for DSM systems using compiler directives, but that realizing the performance gains predicted by the performance model depends primarily on minimizing architecture-specific data locality overhead.
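The abstract does not spell out the model itself, so the following is only a minimal sketch of how the scalability and overhead perspectives are commonly quantified from measured run times. It uses the textbook definitions of speedup, efficiency, and total parallel overhead, which are an assumption here and not necessarily the paper's formulation. The function name `parallel_metrics` and the timings in the usage lines are hypothetical placeholders, not results from the paper.

```python
def parallel_metrics(t_seq, t_par, procs):
    """Return (speedup, efficiency, overhead) for one measured run.

    t_seq: wall-clock time of the sequential code, in seconds
    t_par: wall-clock time of the parallelized code on `procs` processors
    procs: number of processors used for the parallel run

    Textbook definitions (assumed, not taken from the paper):
      speedup    = t_seq / t_par
      efficiency = speedup / procs
      overhead   = procs * t_par - t_seq   (extra processor-seconds)
    """
    if t_seq <= 0 or t_par <= 0 or procs < 1:
        raise ValueError("times must be positive and procs must be >= 1")
    speedup = t_seq / t_par
    efficiency = speedup / procs
    # Processor-seconds spent beyond the useful sequential work.
    overhead = procs * t_par - t_seq
    return speedup, efficiency, overhead


if __name__ == "__main__":
    # Hypothetical timings, for illustration only.
    s, e, o = parallel_metrics(t_seq=100.0, t_par=30.0, procs=4)
    print(f"speedup={s:.2f} efficiency={e:.2f} overhead={o:.1f}s")
```

Under these definitions, an overhead that grows with the processor count while the work stays fixed would be consistent with the data locality costs the abstract identifies as the main obstacle on DSM systems.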
