Performance Optimization for SpMV on Multi-GPU Systems Using Threads and Multiple Streams



Abstract

Sparse matrix-vector multiplication (SpMV) is a key operation in scientific computing and engineering applications. This paper presents an optimization strategy that improves SpMV performance on multi-GPU systems by adopting OpenMP threads and multiple CUDA streams. We propose an efficient scheme that uses OpenMP threads to control multiple GPUs so that they jointly complete SpMV computations. Moreover, we adopt a streamed approach to increase concurrency and further improve SpMV performance. We use HYB (Hybrid ELL/COO), a hybrid sparse storage format, to demonstrate the effectiveness of the proposed approach. Our experimental results show that our approach achieves an average speedup of 3.80 over the existing SpMV implementation on a single GPU.
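The abstract names two ingredients, HYB storage and one host thread per GPU, without giving implementation details. The following is a minimal CPU-only sketch of that idea, not the authors' code. It assumes a contiguous row-block partition, uses Python threads in place of OpenMP threads that would each drive one GPU, and does not model CUDA streams at all. The names `to_hyb`, `spmv_rows`, `spmv_hyb_partitioned`, `num_workers`, and `ell_width` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def to_hyb(dense, ell_width):
    """Split a dense matrix into an ELL part (at most ell_width nonzeros
    per row, padded) and a COO part holding the overflow entries."""
    ell_cols, ell_vals, coo = [], [], []
    for i, row in enumerate(dense):
        nz = [(j, v) for j, v in enumerate(row) if v != 0]
        head, tail = nz[:ell_width], nz[ell_width:]
        pad = ell_width - len(head)
        ell_cols.append([j for j, _ in head] + [-1] * pad)  # -1 marks padding
        ell_vals.append([v for _, v in head] + [0.0] * pad)
        coo.extend((i, j, v) for j, v in tail)
    return ell_cols, ell_vals, coo


def spmv_rows(hyb, x, y, lo, hi):
    """Compute y[lo:hi]; stands in for the work assigned to one GPU
    (assumption: each device owns a contiguous block of rows)."""
    ell_cols, ell_vals, coo = hyb
    for i in range(lo, hi):
        acc = 0.0
        for j, v in zip(ell_cols[i], ell_vals[i]):
            if j >= 0:  # skip ELL padding slots
                acc += v * x[j]
        y[i] = acc
    for i, j, v in coo:  # add the COO overflow for rows in this block
        if lo <= i < hi:
            y[i] += v * x[j]


def spmv_hyb_partitioned(dense, x, num_workers=2, ell_width=2):
    """Run HYB SpMV with one worker thread per (simulated) device."""
    hyb = to_hyb(dense, ell_width)
    n = len(dense)
    y = [0.0] * n
    # Contiguous row blocks, one per worker; blocks never overlap,
    # so workers write to disjoint slices of y.
    bounds = [(w * n // num_workers, (w + 1) * n // num_workers)
              for w in range(num_workers)]
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        futures = [pool.submit(spmv_rows, hyb, x, y, lo, hi)
                   for lo, hi in bounds]
        for f in futures:
            f.result()  # propagate any worker exception
    return y


if __name__ == "__main__":
    A = [[1, 0, 2, 0],
         [0, 3, 0, 0],
         [4, 5, 6, 7],
         [0, 0, 0, 8]]
    print(spmv_hyb_partitioned(A, [1, 2, 3, 4]))
```

In a real multi-GPU version, each worker would presumably select its device, copy its row block and the input vector, and launch the ELL and COO kernels there. How the paper balances rows across GPUs and how it assigns work to streams is not stated in the abstract.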


Bibliographic record

Source
International Symposium on Computer Architecture and High Performance Computing Workshop | 2016 | pp. 67-72 | 6 pages
Authors
Ping Guo; Changjiang Zhang;
Original format: PDF
Keywords
Graphics processing units; Sparse matrices; Kernel; Instruction sets; Arrays; Optimization;


