
Speeding Up Stencil Computations with Kernel Convolution

Abstract

A technique to speed up stencil computation is introduced. Computation and data reuse schemes are developed for its application to 1- and 3-dimensional stencils. The approach traverses the data domain fewer times than a state-of-the-art, straightforward iterative stencil implementation would. Performance results are shown for a variety of platforms, exemplifying how it can be straightforwardly applied with existing techniques and frameworks. The technique, named Aggregate Stencil-Loop Iteration (ASLI), works by applying a stencil obtained by the original stencil operator convolved with itself one or more times. This more complex operator creates new opportunities for in-register data reuse and increases the FLOPs-to-load ratio. The total number of FLOPs decreases for 1D but increases for 2D and 3D star-shaped stencils. In both scenarios, speed-up relative to the state-of-the-art is achieved. ASLI is relatively easy to implement and works synergistically with existing methods to optimize stencil computations.
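
The core idea in the abstract, applying a stencil that is the original operator convolved with itself, can be illustrated with a small sketch. This is not the authors' implementation: the 3-point weights, the periodic boundary handling, the helper names (`convolve`, `apply_stencil`, `flops`), and the FLOP-counting convention (one multiply per stencil point plus one add per extra point, with no fused multiply-add and no use of coefficient symmetry) are all assumptions chosen for illustration.

```python
def convolve(a, b):
    """Full discrete convolution of two coefficient lists."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def apply_stencil(u, w):
    """Apply a centered 1D stencil w to u, assuming periodic boundaries."""
    n, r = len(u), len(w) // 2
    return [
        sum(w[k] * u[(i + k - r) % n] for k in range(len(w)))
        for i in range(n)
    ]


def flops(points):
    """Assumed cost model: `points` multiplies plus `points - 1` additions."""
    return 2 * points - 1


# Hypothetical 3-point stencil; the weights are illustrative only.
w1 = [0.25, 0.5, 0.25]

# Aggregate stencil: the original operator convolved with itself once,
# which yields a 5-point stencil.
w2 = convolve(w1, w1)

# Arbitrary test data.
u = [float((7 * i) % 11) for i in range(16)]

# Two sweeps with the original stencil versus one sweep with the aggregate.
two_sweeps = apply_stencil(apply_stencil(u, w1), w1)
one_sweep = apply_stencil(u, w2)
max_diff = max(abs(a - b) for a, b in zip(two_sweeps, one_sweep))

# Per-point FLOPs under the assumed cost model.
flops_two_sweeps = 2 * flops(len(w1))
flops_one_sweep = flops(len(w2))

print("aggregate stencil:", w2)
print("max difference:", max_diff)
print("FLOPs per point:", flops_two_sweeps, "vs", flops_one_sweep)
```

Under this cost model the single aggregate sweep needs 9 FLOPs per point against 10 for two ordinary sweeps, in line with the 1D decrease stated in the abstract, and it traverses the data once instead of twice. By the same counting, a 2D 5-point star convolved with itself covers 13 points (25 FLOPs against 18 for two sweeps), which matches the stated increase for star-shaped stencils in higher dimensions.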
