Generating optimal CUDA sparse matrix-vector product implementations for evolving GPU hardware

Ahmed H. El Zein; Alistair P. Rendell

Abstract

The CUDA model for graphics processing units (GPUs) presents the programmer with a plethora of different programming options. These include different memory types, different memory access methods and different data types. Identifying which options to use, and when, is a non-trivial exercise. This paper explores the effect of these different options on the performance of a routine that evaluates sparse matrix-vector products (SpMV) across three different generations of NVIDIA GPU hardware. A process for analysing performance and selecting the subset of implementations that perform best is proposed. The potential for mapping sparse matrix attributes to optimal CUDA SpMV implementations is discussed.
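
For readers unfamiliar with the operation, the following is a minimal serial sketch of the sparse matrix-vector product y = Ax that the abstract refers to. It is not the authors' implementation: the abstract does not name a storage format, so compressed sparse row (CSR) is assumed here, and the function name `spmv_csr` and the sample matrix are hypothetical. In GPU versions of this computation the loop over rows is commonly the part distributed across threads, and choices such as which memory holds `x` are the kind of option the paper evaluates.

```python
from typing import List, Sequence


def spmv_csr(row_ptr: Sequence[int], col_idx: Sequence[int],
             values: Sequence[float], x: Sequence[float]) -> List[float]:
    """Return y = A @ x for a matrix A stored in CSR form (assumed format).

    row_ptr[i]..row_ptr[i + 1] delimits the non-zeros of row i;
    col_idx[k] is the column of the k-th non-zero and values[k] its value.
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for row in range(n_rows):
        acc = 0.0
        # Only the stored non-zeros of this row contribute to y[row].
        for k in range(row_ptr[row], row_ptr[row + 1]):
            acc += values[k] * x[col_idx[k]]
        y[row] = acc
    return y


# Hypothetical 3x3 example matrix:
# [[4, 0, 1],
#  [0, 2, 0],
#  [3, 0, 5]]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
values = [4.0, 1.0, 2.0, 3.0, 5.0]
x = [1.0, 2.0, 3.0]
y = spmv_csr(row_ptr, col_idx, values, x)
```

The accesses to `x` are indirect (through `col_idx`) and therefore irregular, which is one reason the choice of memory type and access method can matter for performance on a GPU.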

Bibliographic details

Source: Concurrency and Computation, 2012, No. 1, pp. 3-13 (11 pages)
Authors: Ahmed H. El Zein; Alistair P. Rendell
Author affiliations:

ANU Supercomputer Facility, Leonard Huxley Building (#56), The Australian National University, Canberra, ACT 0200, Australia;

School of Computer Science, The Australian National University, Canberra, ACT 0200, Australia

Original format: PDF
Language: English
Keywords: CUDA; GPU; NVIDIA; sparse; matrix-vector; fermi; S2050

Similar literature

1. CUDA GPU libraries and novel sparse matrix-vector multiplication - implementation and performance enhancement in unstructured finite element computations [J]. Richard Haney, Ram Mohan. International Journal of Computational Science and Engineering, 2019, No. 4.
2. CUDA-enabled Sparse Matrix-Vector Multiplication on GPUs using atomic operations [J]. Hoang-Vu Dang, Bertil Schmidt. Parallel Computing, 2013, No. 11.
3. The Sliced COO Format for Sparse Matrix-Vector Multiplication on CUDA-enabled GPUs [J]. Hoang-Vu Dang, Bertil Schmidt. Procedia Computer Science, 2012, No. 1.
4. From Sparse Matrix to Optimal GPU CUDA Sparse Matrix Vector Product Implementation [C]. El Zein Ahmed H., Rendell Alistair P. 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing, 2010.
5. Exploring the potential for accelerating sparse matrix-vector product on a Processing-in-Memory architecture [D]. Youssefi, Annahita. 2009.
6. CUDA compatible GPU cards as efficient hardware accelerators for Smith-Waterman sequence alignment [O]. Svetlin A Manavski, Giorgio Valle. 2008.
7. The Sliced COO Format for Sparse Matrix-Vector Multiplication on CUDA-enabled GPUs [O]. Dang Hoang-Vu, Schmidt Bertil. 2012.
