Scalable and Modular Algorithms for Floating-Point Matrix Multiplication on Reconfigurable Computing Systems

Ling Zhuo; Viktor K. Prasanna

Abstract

The abundant hardware resources on current reconfigurable computing systems provide new opportunities for high-performance parallel implementations of scientific computations. In this paper, we study designs for floating-point matrix multiplication, a fundamental kernel in a number of scientific applications, on reconfigurable computing systems. We first analyze design trade-offs in implementing this kernel. These trade-offs are caused by the inherent parallelism of matrix multiplication and the resource constraints, including the number of configurable slices, the size of on-chip memory, and the available memory bandwidth. We propose three parameterized algorithms which can be tuned according to the problem size and the available hardware resources. Our algorithms employ a linear array architecture with simple control logic. This architecture effectively utilizes the available resources and reduces routing complexity. The Processing Elements (PEs) used in our algorithms are modular so that it is easy to embed floating-point units into them. Experimental results on a Xilinx Virtex-II Pro XC2VP100 show that our algorithms achieve good scalability and high sustained GFLOPS performance. We also implement our algorithms on Cray XD1. XD1 is a high-end reconfigurable computing system that employs both general-purpose processors and reconfigurable devices. Our algorithms achieve a sustained performance of 2.06 GFLOPS on a single node of XD1.
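
The abstract does not spell out the three algorithms, so the following is only an illustrative software sketch of the general idea of a linear array of PEs, not the authors' design. It assumes each PE keeps one column of B in local storage while elements of A are streamed past all PEs, and that the number of PEs is a tunable parameter. The names `linear_array_matmul`, `num_pes`, `flop_count`, and `sustained_gflops` are hypothetical and chosen for this example.

```python
def linear_array_matmul(A, B, num_pes):
    """Multiply two n x n matrices on a simulated linear array of PEs.

    Assumption (not from the paper): PE j stores one column of B locally,
    and every element of A is streamed through all PEs in turn.
    """
    n = len(A)
    C = [[0.0] * n for _ in range(n)]
    # Handle the columns of B in groups of at most num_pes columns,
    # so a small array can still process a large problem.
    for col_start in range(0, n, num_pes):
        cols = range(col_start, min(col_start + num_pes, n))
        # Local memory of each PE: one full column of B.
        local_b = [[B[k][c] for k in range(n)] for c in cols]
        for i in range(n):
            acc = [0.0] * len(local_b)  # one accumulator per PE
            for k in range(n):
                a = A[i][k]  # element of A streamed past every PE
                for pe, b_col in enumerate(local_b):
                    acc[pe] += a * b_col[k]  # multiply-accumulate in PE
            for pe, c in enumerate(cols):
                C[i][c] = acc[pe]
    return C


def flop_count(n):
    """Floating-point operations in a standard n x n matrix multiply."""
    return 2 * n ** 3  # n^3 multiplications plus n^3 additions


def sustained_gflops(n, seconds):
    """Sustained GFLOPS for an n x n multiply that took `seconds`."""
    return flop_count(n) / seconds / 1e9


if __name__ == "__main__":
    A = [[1.0, 2.0], [3.0, 4.0]]
    B = [[5.0, 6.0], [7.0, 8.0]]
    print(linear_array_matmul(A, B, num_pes=2))
    print(sustained_gflops(1000, 1.0))
```

Changing `num_pes` changes only how the columns of B are grouped, not the result, which mirrors in a loose way the idea of tuning a design to the available resources. The `sustained_gflops` helper shows how a figure such as the 2.06 GFLOPS quoted above relates operation count to run time under the usual 2n^3 convention.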

Bibliographic details

Source: IEEE Transactions on Parallel and Distributed Systems, 2007, no. 4, pp. 433-448 (16 pages)
Authors: Ling Zhuo; Viktor K. Prasanna
Original format: PDF
Language: English
Chinese Library Classification: Computing technology, computer technology
Keywords: Scientific computing; computations on matrices; field-programmable gate arrays; parallel algorithms; reconfigurable hardware

Similar documents

1. Scalable and Modular Algorithms for Floating-Point Matrix Multiplication on Reconfigurable Computing Systems [J]. Ling Zhuo, Prasanna V.K. IEEE Transactions on Parallel and Distributed Systems. 2007, no. 4
2. Degree of scalability: scalable reconfigurable mesh algorithms for multiple addition and matrix-vector multiplication [J]. Ramachandran Vaidyanathan, Jerry L. Trahan, Chun-ming Lu. Parallel Computing. 2003, no. 1
3. Fast and processor efficient parallel matrix multiplication algorithms on a linear array with a reconfigurable pipelined bus system [J]. Keqin Li, Yi Pan. IEEE Transactions on Parallel and Distributed Systems. 1998, no. 8
4. Scalable and modular algorithms for floating-point matrix multiplication on FPGAs [C]. Zhuo, L., Prasanna, . 2004
5. Analysis-Driven Design of Parallel Floating-Point Matrix Multiplication for Implementation in Reconfigurable Logic [D]. Khayyat, Ahmad. 2013
6. 3D Printed Reconfigurable Modular Microfluidic System for Generating Gel Microspheres [O]. Xiaojun Chen, Deyun Mo, Manfeng Gong. 2020
7. Fast and processor efficient parallel matrix multiplication algorithms on a linear array with a reconfigurable pipelined bus system [O]. Keqin Li, Yi Pan. 1998
8. EPIQ - A Meta-Computing Framework for Scalable, Responsive and Reconfigurable End-to-End Resource Management, and Agile Objects: Middleware for Survivable Information Systems [R]. Nahrstedt, K. 2003
