Computing programs containing band linear recurrences on vector supercomputers

Haigeng Wang; Nicolau A.


Abstract

Many large-scale scientific and engineering computations, e.g., some of the Grand Challenge problems, spend a major portion of execution time in their core loops computing band linear recurrences (BLRs). Conventional compiler parallelization techniques cannot generate scalable parallel code for this type of computation because they respect loop-carried dependences (LCDs) in programs, and there is a limited amount of parallelism in a BLR with respect to LCDs. For many applications, using library routines to replace the core BLR requires the separation of BLR from its dependent computation, which usually incurs significant overhead. In this paper, we present a new scalable algorithm called the Regular Schedule, for parallel evaluation of BLRs. We describe our implementation of the Regular Schedule and discuss how to obtain maximum memory throughput in implementing the schedule on vector supercomputers. We also illustrate our approach, based on our Regular Schedule, to parallelizing programs containing BLR and other kinds of code. Significant improvements in CPU performance for a range of programs containing BLR implemented using the Regular Schedule in C over the same programs implemented using highly optimized coded-in-assembly BLAS routines [11] are demonstrated on Convex C240. Our approach can be used both at the user level in parallel programming code containing BLRs, and in compiler parallelization of such programs combined with recurrence recognition techniques for vector supercomputers.
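
For readers unfamiliar with the terms, a band linear recurrence of order m is commonly written as x[i] = c[i] + a[i][0]*x[i-1] + ... + a[i][m-1]*x[i-m]. The Python sketch below is illustrative only and is not the paper's Regular Schedule, which this record does not describe. The function names are hypothetical. The first function shows the loop-carried dependence the abstract refers to; the second applies recursive doubling, a well-known textbook technique, to the first-order case to show how such a dependence can be restructured into steps whose iterations are independent.

```python
def blr_sequential(a, c, x_init):
    """Evaluate x[i] = c[i] + sum_{j=1..m} a[i][j-1] * x[i-j] in index order.

    x_init holds the m starting values x[0..m-1]; a[i] holds the m
    coefficients for step i (entries for i < m are ignored).
    """
    m = len(x_init)
    x = list(x_init)
    for i in range(m, len(c)):
        # Loop-carried dependence: x[i] needs the m previous results,
        # so iterations cannot simply run side by side.
        x.append(c[i] + sum(a[i][j - 1] * x[i - j] for j in range(1, m + 1)))
    return x


def first_order_doubling(a, c):
    """Evaluate x[0] = c[0], x[i] = a[i]*x[i-1] + c[i] by recursive doubling.

    a[0] is ignored. Each pass composes the affine map at index i with the
    one at index i-d; within a pass every index is independent of the
    others, which is the property a vector machine can exploit.
    """
    a, c = list(a), list(c)
    n = len(c)
    d = 1
    while d < n:
        new_a = a[:d] + [a[i] * a[i - d] for i in range(d, n)]
        new_c = c[:d] + [a[i] * c[i - d] + c[i] for i in range(d, n)]
        a, c = new_a, new_c
        d *= 2
    return c
```

The doubling version needs about log2(n) passes but performs more total arithmetic than the sequential loop, which is the usual trade-off when parallelizing recurrences.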


Bibliographic details

Source: IEEE Transactions on Parallel and Distributed Systems, 1996, no. 8, pp. 769-782 (14 pages)
Authors: Haigeng Wang; Nicolau A.
Original format: PDF
Language: English
CLC classification: Computing technology, computer technology

Similar literature

1. Vectors of Identity Matrix are the only Solution for a Special Linear System of Equations and Linear Programming Problems [J]. Enagandula Prasad. International Journal of Applied Engineering Research, 2019, no. 1aPta1.
2. A 28-nm Compute SRAM With Bit-Serial Logic/Arithmetic Operations for Programmable In-Memory Vector Computing [J]. Wang Jingcheng, Wang Xiaowei, Eckert Charles. IEEE Journal of Solid-State Circuits, 2020, no. 1.
3. Equivalence between polyhedral projection, multiple objective linear programming and vector linear programming [J]. Loehne Andreas, Weissing Benjamin. Mathematical Methods of Operations Research, 2016, no. 2.
4. Scalable techniques for computing band linear recurrences on massively parallel and vector supercomputers [C]. Haigeng Wang, Nicolau, A. 1994.
5. A unified compiler framework for program analysis, optimization, and automatic vectorization with chains of recurrences [D]. Shou, Yixin. 2009.
6. A comparison of univariate vector bilinear autoregressive and band power features for brain–computer interfaces [O]. Clemens Brunner, Martin Billinger, Carmen Vidaurre.
7. Computing Programs Containing Band Linear Recurrences on Vector Supercomputers [O]. Haigeng Wang, Alexandru Nicolau. 1992.
8. Design and Implementation of Cost-Effective Algorithms for Direct Solution of Banded Linear Systems on the Vector Processor System 32 Supercomputer [R]. Samba, A. S. 1985.
