Conference: AIAA aerospace sciences meeting; AIAA SciTech Forum

OpenACC directive-based GPU acceleration of an implicit reconstructed discontinuous Galerkin method for compressible flows on 3D unstructured grids



Abstract

Despite the increasing popularity of OpenACC directive-based acceleration of computational fluid dynamics (CFD) codes on general-purpose graphics processing units (GPGPUs), efficient implicit algorithms for high-order methods on unstructured grids remain relatively unexplored. This is mainly because the local cache memory of even a top-notch GPGPU is still far smaller than that of a common CPU. As a result, many state-of-the-art preconditioning algorithms, such as Symmetric Gauss-Seidel (SGS) and Lower-Upper Symmetric Gauss-Seidel (LU-SGS), which rely heavily on matrix operations with strong inherent data dependencies, become extremely inefficient when simply ported to GPGPUs, because they are bound by the local cache memory. In the present study, an efficient implicit algorithm for a GPGPU-accelerated reconstructed discontinuous Galerkin (DG) CFD code is introduced and assessed for the solution of the Euler equations on unstructured grids. The block matrix operations are refined to the element level. A matrix inversion algorithm based on Gauss-Jordan elimination is adopted to optimize performance on the GPU platform. For the SGS-type linear solver/preconditioner, a straightforward element reordering algorithm is employed to eliminate data dependency. The developed algorithm is implemented on GPGPUs to accelerate a high-order implicit reconstructed discontinuous Galerkin (rDG) method as a compressible flow solver on 3D unstructured grids. Several numerical tests are carried out to measure the speedup factor and the parallel efficiency; the results indicate that the presented algorithm offers low-overhead concurrent CFD simulation on unstructured grids on NVIDIA GPGPUs.
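To make the element-level Gauss-Jordan idea concrete, the following is a minimal sketch in C with OpenACC directives. It is not the authors' implementation: the function names `invert_block` and `invert_all_blocks`, the row-major block layout, the use of partial pivoting, and the tolerance `PIVOT_TOL` are all assumptions chosen for illustration. Each element owns one small dense block, and the blocks are inverted independently, so the outer loop has no data dependency and can be offloaded with a single directive.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative absolute pivot tolerance (assumption, not from the paper). */
#define PIVOT_TOL 1e-14

/* Invert one dense n x n block (row-major) by Gauss-Jordan elimination.
 * `a` is overwritten during elimination; `inv` receives the inverse.
 * Returns 0 on success, -1 if no usable pivot is found.
 * Partial pivoting is an assumption of this sketch. */
#pragma acc routine seq
int invert_block(double *a, double *inv, int n)
{
    /* Start from the identity matrix. */
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j)
            inv[i * n + j] = (i == j) ? 1.0 : 0.0;

    for (int col = 0; col < n; ++col) {
        /* Find the row with the largest magnitude in this column. */
        int piv = col;
        double best = a[col * n + col];
        if (best < 0.0) best = -best;
        for (int r = col + 1; r < n; ++r) {
            double v = a[r * n + col];
            if (v < 0.0) v = -v;
            if (v > best) { best = v; piv = r; }
        }
        if (best < PIVOT_TOL)
            return -1;

        /* Swap the pivot row into place in both matrices. */
        if (piv != col) {
            for (int j = 0; j < n; ++j) {
                double t = a[col * n + j];
                a[col * n + j] = a[piv * n + j];
                a[piv * n + j] = t;
                t = inv[col * n + j];
                inv[col * n + j] = inv[piv * n + j];
                inv[piv * n + j] = t;
            }
        }

        /* Scale the pivot row so the pivot becomes 1. */
        double d = 1.0 / a[col * n + col];
        for (int j = 0; j < n; ++j) {
            a[col * n + j] *= d;
            inv[col * n + j] *= d;
        }

        /* Eliminate this column from every other row. */
        for (int r = 0; r < n; ++r) {
            if (r == col) continue;
            double f = a[r * n + col];
            for (int j = 0; j < n; ++j) {
                a[r * n + j] -= f * a[col * n + j];
                inv[r * n + j] -= f * inv[col * n + j];
            }
        }
    }
    return 0;
}

/* Invert the blocks of all elements; block e starts at offset e*n*n.
 * Returns the number of blocks that could not be inverted. */
int invert_all_blocks(double *blocks, double *inverses, int nelem, int n)
{
    assert(nelem >= 0 && n > 0);
    int failed = 0;
    #pragma acc parallel loop reduction(+:failed) \
        copy(blocks[0:nelem * n * n]) copyout(inverses[0:nelem * n * n])
    for (int e = 0; e < nelem; ++e) {
        size_t off = (size_t)e * n * n;
        if (invert_block(blocks + off, inverses + off, n) != 0)
            failed += 1;
    }
    return failed;
}
```

A compiler without OpenACC support ignores the directives and runs the same loops serially, which is convenient for checking results on the CPU before offloading. The fixed, branch-light sweep over a small block is one plausible reason Gauss-Jordan suits this setting, though the abstract does not state the authors' rationale.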
