
Static Cost Estimation for Data Layout Selection on GPUs



Abstract

Performance modeling provides mathematical models and quantitative analysis for designing and optimizing computer systems. On high performance architectures, high-latency memory accesses dominate execution time in many classes of applications, so performance modeling of memory accesses has been an important research topic. In high performance computing, data layout can significantly affect the efficiency of memory access operations. In recent years, the problem of data layout selection has been well studied on various parallel CPU architectures and on some GPU architectures. GPUs have memory hierarchies that differ from those of multi-core CPUs. While several existing projects have examined data layout selection on GPUs, a mathematical cost model for it is still lacking. This motivates us to investigate static cost analysis methods that could better guide future data layout selection work, and perhaps even the design of new SIMT architectures.

In this paper, we propose a comprehensive cost analysis for data layout selection on GPUs. We build our cost function on knowledge of the GPU memory hierarchy and develop an algorithm that allows researchers to perform compile-time cost estimation for a given data layout. Furthermore, we introduce a new vector-based representation of the estimated cost, which can better estimate the cost of applications with dynamic-length loops. We apply our cost analysis to selected benchmarks from past publications on data layout selection. Our experimental results show that our cost analysis can accurately predict the relative costs of different data layouts. Using the cost model presented in this paper, we are developing an automatic data layout selection tool in our ongoing work.
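To make concrete why data layout affects memory access cost on a GPU, the following is a minimal illustrative sketch and not the cost function proposed in the paper. It counts how many memory segments one warp touches when every thread reads the same field of its own record, under an array-of-structures (AoS) layout and a structure-of-arrays (SoA) layout. The warp size of 32, the 128-byte segment size, the 4-byte element size, and all function names are assumptions chosen for illustration.

```python
WARP_SIZE = 32       # threads per warp (assumed)
SEGMENT_BYTES = 128  # size of one memory transaction segment (assumed)


def segments_touched(addresses, segment_bytes=SEGMENT_BYTES):
    """Count the distinct aligned segments covered by a list of byte addresses."""
    return len({addr // segment_bytes for addr in addresses})


def aos_addresses(field, num_fields, elem_bytes=4, warp_size=WARP_SIZE):
    """Byte address read by each thread when records are stored contiguously.

    Thread t reads field `field` of record t; consecutive records are
    num_fields * elem_bytes apart, so the warp's accesses are strided.
    """
    return [(t * num_fields + field) * elem_bytes for t in range(warp_size)]


def soa_addresses(field, num_elems, elem_bytes=4, warp_size=WARP_SIZE):
    """Byte address read by each thread when each field has its own array.

    Thread t reads element t of the array for `field`; consecutive threads
    read consecutive elements, so the warp's accesses are contiguous.
    """
    return [(field * num_elems + t) * elem_bytes for t in range(warp_size)]


# Hypothetical record with 4 four-byte fields, 1024 records in total.
aos_cost = segments_touched(aos_addresses(field=0, num_fields=4))
soa_cost = segments_touched(soa_addresses(field=0, num_elems=1024))

print("AoS segments per warp access:", aos_cost)
print("SoA segments per warp access:", soa_cost)
```

Under these assumptions the strided AoS access spans four segments while the contiguous SoA access fits in one, which is the kind of relative difference a static cost estimate is meant to capture before the program runs.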
