Journal: International Journal for Numerical Methods in Fluids

Performance of a three-dimensional unstructured mesh compressible flow solver on NVIDIA Fermi-class graphics processing unit hardware



Abstract

We describe the performance of CHICOMA, a 3D unstructured mesh compressible flow solver, on graphics processing unit (GPU) hardware. The approach used to deploy the solver on GPU architectures derives from the threaded multicore execution model used in CHICOMA, and it attempts to improve memory performance by applying graph theory techniques. The result is a scheme that can be deployed on the GPU with high-level programming constructs (for example, compiler directives) rather than low-level programming extensions. With an NVIDIA Fermi-class GPU (NVIDIA Corp., Santa Clara, CA, USA) and double-precision floating-point arithmetic, we observe performance gains of 4-5x on problem sizes of 10^6 to 10^7 tetrahedra. We also compare GPU performance to threaded multicore performance with OpenMP and demonstrate hybrid multicore-GPU calculations with adaptive mesh refinement. Published 2012. This article is a US Government work and is in the public domain in the USA.
