Scalable Compiler Optimizations for Improving the Memory System Performance in Multi- and Many-core Processors.

Abstract

The last decade has seen the transition from unicore processors to their multi-core (and now many-core) counterparts. This transition has renewed the pressure on compiler developers to extract performance from these parallel processors. In addition to extracting parallelism, another important responsibility of a parallelizing (or optimizing) compiler is to improve the memory system performance of the source program. This is particularly important because multi-cores have accentuated the memory wall and the bandwidth wall.

In this thesis, we identify three key challenges facing compiler developers on current processors: (1) the diverse set of microarchitectures in existence at any time and, more importantly, the changes in microarchitecture between generations; (2) the poor showing of compilers on real applications, which contain large scopes of statements amenable to optimization; and (3) the unscalability of compilers, a traditional limitation in which compilers choose to optimize small scopes to contain compile time and memory requirements, and thus lose optimization opportunities.

In this thesis, we make the following contributions to address the above challenges. (1) We revisit three compiler optimizations (loop tiling and loop fusion for enhancing temporal locality, and data prefetching for hiding memory latency) for improving memory (and parallel) performance in light of recent advances in microarchitecture, including deeper memory hierarchies, multithreading, (short-vector) SIMDization, and hardware prefetching, and we propose generic algorithms implementable in production compilers for a range of processors. (2) We propose heuristics in a cost model to choose good statements to fuse, and we improve dependence analysis so that critical fusion opportunities in application programs are not lost when they exist. (3) The final contribution of this thesis is a solution to the unscalability problem. Based on program semantics, we devise a way to represent the entire program with far fewer representative statements and dependences, significantly improving the compile time and memory requirements of compilation. Thus, real applications can now be optimized not only efficiently, but also at very low overhead.
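
To make two of the transformations named in the abstract concrete, the following is a minimal hand-written sketch in C of loop fusion and loop tiling. It is not the thesis's algorithm or cost model: the function names, the matrix size `N`, and the tile size `TILE` are all illustrative assumptions, and a real compiler would derive the tile size from the cache parameters of the target.

```c
#include <stddef.h>

#define N 64    /* matrix dimension; illustrative assumption */
#define TILE 16 /* hypothetical tile size; must divide N in this sketch */

/* Unfused: two passes over the data. Each b[i] is written in the first
 * loop and read again in the second, possibly after it has left the cache. */
void scale_then_add_unfused(const double *a, double *b, double *c, size_t n) {
    for (size_t i = 0; i < n; i++)
        b[i] = 2.0 * a[i];
    for (size_t i = 0; i < n; i++)
        c[i] = b[i] + 1.0;
}

/* Fused: one pass. b[i] is reused immediately after it is produced,
 * which improves temporal locality. */
void scale_then_add_fused(const double *a, double *b, double *c, size_t n) {
    for (size_t i = 0; i < n; i++) {
        b[i] = 2.0 * a[i];
        c[i] = b[i] + 1.0;
    }
}

/* Untiled matrix multiply on row-major N x N arrays.
 * C must be zero-initialized by the caller. */
void matmul_naive(const double *A, const double *B, double *C) {
    for (size_t i = 0; i < N; i++)
        for (size_t k = 0; k < N; k++)
            for (size_t j = 0; j < N; j++)
                C[i * N + j] += A[i * N + k] * B[k * N + j];
}

/* Tiled matrix multiply: the iteration space is split into
 * TILE x TILE x TILE blocks so that the pieces of A, B, and C touched by
 * one block can stay in cache while they are reused.
 * C must be zero-initialized by the caller. */
void matmul_tiled(const double *A, const double *B, double *C) {
    for (size_t ii = 0; ii < N; ii += TILE)
        for (size_t kk = 0; kk < N; kk += TILE)
            for (size_t jj = 0; jj < N; jj += TILE)
                for (size_t i = ii; i < ii + TILE; i++)
                    for (size_t k = kk; k < kk + TILE; k++)
                        for (size_t j = jj; j < jj + TILE; j++)
                            C[i * N + j] += A[i * N + k] * B[k * N + j];
}
```

In both pairs the transformed version computes the same values as the original; only the order in which memory is visited changes. Deciding which statements to fuse and how large a tile to use is what the abstract's cost model and heuristics address, and this sketch does not attempt to model that.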

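Bibliographic record

  • Author

    Mehta, Sanyam

  • Author affiliation

    University of Minnesota

  • Degree-granting institution University of Minnesota
  • Subject Computer Science; Information Science
  • Degree Ph.D.
  • Year 2014
  • Pages 172 p.
  • Total pages 172
  • Original format PDF
  • Language eng
  • Chinese Library Classification
  • Keywords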
