IEEE International Parallel and Distributed Processing Symposium

Taming the 'Monster': Overcoming Program Optimization Challenges on SW26010 Through Precise Performance Modeling

Abstract

This paper presents an effort to overcome the complexities of program optimization on SW26010, the heterogeneous many-core processor that powers Sunway TaihuLight, the world's top-ranked supercomputer. The solution centers on a precise, static performance model for modern many-core processors. Through a careful design that leverages the special properties of SW26010 and an effective treatment of massive parallelism, the model achieves high accuracy, with an average error of less than 5% in estimating program execution performance. The precise performance model opens many opportunities for analyzing and guiding code optimizations. The paper demonstrates its usefulness by revealing a series of insights into the effects of some important code optimizations on SW26010. Moreover, it demonstrates that with such a precise performance model, it is feasible to replace empirical auto-tuning with static auto-tuning for optimizing regular loops on heterogeneous many-core systems. Such a replacement speeds up the tuning process by as much as a factor of 43 while keeping the tuning quality loss below 6%.
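To make the idea of static auto-tuning concrete, the following is a minimal sketch, not the paper's model. It assumes a toy analytical cost model for a regular loop that streams data through a software-managed scratchpad in tiles, and it picks the tile size by minimizing the predicted time instead of running and timing each variant. The names `ToyMachine`, `predict_time`, and `static_tune` are hypothetical, and every machine parameter is an illustrative placeholder rather than a measured SW26010 value.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class ToyMachine:
    """Illustrative placeholders only; not measured SW26010 parameters."""
    scratchpad_bytes: int = 64 * 1024   # capacity of the software-managed buffer
    dma_bytes_per_s: float = 2.0e9      # sustained transfer bandwidth
    dma_startup_s: float = 1.0e-6       # fixed cost paid once per transfer
    flops_per_s: float = 1.0e10         # sustained compute rate


def predict_time(n_elems: int, tile: int, m: ToyMachine,
                 bytes_per_elem: int = 8, flops_per_elem: float = 2.0) -> float:
    """Predicted seconds to process n_elems in tiles of `tile` elements.

    Toy model: one transfer per tile, and no overlap between
    transfer and compute (a simplifying assumption).
    """
    n_tiles = -(-n_elems // tile)  # ceiling division
    transfer = n_tiles * m.dma_startup_s + n_elems * bytes_per_elem / m.dma_bytes_per_s
    compute = n_elems * flops_per_elem / m.flops_per_s
    return transfer + compute


def static_tune(n_elems: int, candidates: Iterable[int], m: ToyMachine,
                bytes_per_elem: int = 8) -> int:
    """Return the candidate tile size with the lowest predicted time.

    No variant is executed: candidates that do not fit in the scratchpad
    are discarded, and the rest are ranked by the model alone.
    """
    feasible = [t for t in candidates if t * bytes_per_elem <= m.scratchpad_bytes]
    if not feasible:
        raise ValueError("no candidate tile fits in the scratchpad")
    return min(feasible, key=lambda t: predict_time(n_elems, t, m, bytes_per_elem))


machine = ToyMachine()
best_tile = static_tune(1 << 20, [256, 1024, 4096, 8192, 16384], machine)
```

An empirical auto-tuner would compile and time every feasible candidate. The static version replaces those runs with calls to `predict_time`, which is where a tuning-time speedup of the kind the abstract reports would come from, at the cost of whatever error the model carries.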