Journal of Fluids Engineering: Transactions of the ASME

Computational Fluid Dynamics Computations Using a Preconditioned Krylov Solver on Graphical Processing Units

Abstract

Graphical processing unit (GPU) computing has grown extensively in recent years owing to advances in both hardware and the software stack. This has increased the use of GPUs as accelerators across a broad spectrum of applications. This work deals with the use of general-purpose GPUs for computational fluid dynamics (CFD) computations. The paper discusses strategies and findings on porting a large multifunctional CFD code to the GPU architecture. Within this framework, the most compute-intensive segment of the software, the BiCGStab linear solver using additive Schwarz block preconditioners with point Jacobi iterative smoothing, is optimized for the GPU platform using various techniques in CUDA Fortran. Representative turbulent channel and pipe flows are investigated for validation and benchmarking purposes. Both single- and double-precision calculations are highlighted. For a modest single-block grid of 64 × 64 × 64, the turbulent channel flow computations showed a speedup of about eightfold in double precision and more than 13-fold in single precision on the NVIDIA Tesla GPU over a serial run on an Intel central processing unit (CPU). For the pipe flow, consisting of 1.78 × 10^6 grid cells distributed over 36 mesh blocks, the gains were more modest, at 4.5-fold and 6.5-fold for double and single precision, respectively.
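
The solver named in the abstract can be illustrated with a small serial sketch. The code below is not the paper's CUDA Fortran implementation. It is a pure-Python illustration of right-preconditioned BiCGStab in which the preconditioner is a non-overlapping additive Schwarz step: each block of unknowns is smoothed independently with a few point-Jacobi sweeps. The test matrix (a 1D Poisson stencil), the block size, the sweep count, and all function names (`schwarz_jacobi`, `bicgstab`, and the helpers) are assumptions made for illustration only.

```python
def matvec(A, x):
    """Dense matrix-vector product."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def axpy(alpha, x, y):
    """Return alpha * x + y."""
    return [alpha * xi + yi for xi, yi in zip(x, y)]

def schwarz_jacobi(A, r, blocks, sweeps=3):
    """Approximate A^-1 r block by block (non-overlapping additive Schwarz).

    Each block is treated independently with a fixed number of point-Jacobi
    sweeps starting from zero; couplings between blocks are ignored.
    """
    z = [0.0] * len(r)
    for idx in blocks:
        zb = [0.0] * len(idx)
        for _ in range(sweeps):
            new = []
            for i_loc, i in enumerate(idx):
                # Local residual uses only entries of A inside this block.
                local_Az = sum(A[i][j] * zb[j_loc] for j_loc, j in enumerate(idx))
                new.append(zb[i_loc] + (r[i] - local_Az) / A[i][i])
            zb = new
        for i_loc, i in enumerate(idx):
            z[i] = zb[i_loc]
    return z

def bicgstab(A, b, precond, tol=1e-10, max_iter=200):
    """Right-preconditioned BiCGStab with a zero initial guess."""
    n = len(b)
    x = [0.0] * n
    r = list(b)          # residual for x = 0
    r_hat = list(r)      # fixed shadow residual
    rho = alpha = omega = 1.0
    v = [0.0] * n
    p = [0.0] * n
    b_norm = dot(b, b) ** 0.5
    for it in range(1, max_iter + 1):
        rho_new = dot(r_hat, r)
        beta = (rho_new / rho) * (alpha / omega)
        p = axpy(beta, axpy(-omega, v, p), r)   # p = r + beta * (p - omega * v)
        y = precond(p)
        v = matvec(A, y)
        alpha = rho_new / dot(r_hat, v)
        s = axpy(-alpha, v, r)
        x = axpy(alpha, y, x)
        if dot(s, s) ** 0.5 <= tol * b_norm:
            return x, it
        z = precond(s)
        t = matvec(A, z)
        omega = dot(t, s) / dot(t, t)
        x = axpy(omega, z, x)
        r = axpy(-omega, t, s)
        rho = rho_new
        if dot(r, r) ** 0.5 <= tol * b_norm:
            return x, it
    return x, max_iter

# Hypothetical test problem: 1D Poisson matrix tridiag(-1, 2, -1), 16 unknowns,
# split into 4 blocks of 4 unknowns each.
n = 16
A = [[2.0 if i == j else -1.0 if abs(i - j) == 1 else 0.0 for j in range(n)]
     for i in range(n)]
b = [1.0] * n
blocks = [list(range(k, k + 4)) for k in range(0, n, 4)]
x, iters = bicgstab(A, b, lambda r: schwarz_jacobi(A, r, blocks))
```

In this sketch the block loop in `schwarz_jacobi` and the per-row Jacobi update are independent of one another, which is the kind of fine-grained parallelism that would map naturally onto GPU threads. How the paper actually organizes its kernels is not stated in the abstract.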